PHYSICS
Dr. P. LeClair
General Physics II
Department of Physics and Astronomy
The University of Alabama
Copyright © 2007, 2008 by Patrick R. LeClair.
This material may be distributed only subject to the terms and conditions set forward in the Open
Publication License, v1.0 or later. The latest version is presently available at:
http://www.opencontent.org/openpub/
Distribution of substantively modified versions of this document is prohibited without the explicit
permission of the copyright holder. Distribution of the work or derivative of the work in any
standard (paper) book form is prohibited unless prior permission is obtained from the copyright
holder.
Contents
I Relativity
2 Relativity
2.1 Frames of Reference
2.2 Moving Frames of Reference
2.2.1 Lack of a Preferred Reference Frame
2.2.2 Relative Motion
2.2.3 Invariance of the Speed of Light
2.2.4 Principles of special relativity
2.3 Consequences of Relativity
2.3.1 Lack of Simultaneity
2.3.2 Time Dilation
2.3.3 Length Contraction
2.3.4 Time and position in different reference frames
2.3.5 Addition of Velocities in Relativity
2.4 Mass, Momentum, and Energy
2.4.1 Relativistic Momentum
2.4.2 Relativistic Energy
2.4.3 Relativistic Mass
2.5 General Relativity
2.6 Quick Questions
2.7 Problems
2.8 Solutions to Quick Questions
2.9 Solutions to Problems
7 Magnetism
7.1 Magnetic Fields and Forces
7.1.1 The Magnetic Force
11 Mirrors
11.1 Flat Mirrors
11.1.1 Image formation
11.1.2 Ray Diagrams
11.1.3 Conventions for Ray Diagrams
11.1.4 Handedness
11.2 Spherical Mirrors
11.2.1 Concave Mirrors
11.2.2 Convex Spherical Mirrors
11.3 Ray Diagrams for Mirrors
11.4 Parabolic Mirrors
11.5 Quick Questions
11.6 Problems
12 Lenses
12.1 Quick Questions
12.2 Problems
12.3 Solutions to Quick Questions
12.4 Solutions to Problems
A BamaLab
A.1 Introduction and Motivation
A.2 Hardware
A.2.1 LabJack U3
A.2.2 Measuring Voltage
A.2.3 Measuring Current
A.2.4 Sourcing Voltage
A.2.5 Sourcing Current
A.2.6 Finished Product
A.3 Software
A.3.1 Multimeter
A.3.2 Current vs. Voltage
A.3.3 Step Response
A.3.4 Oscilloscope
A.3.5 Example measurements
A.4 Outlook
A.5 Acknowledgments
Bibliography
List of Figures
A.1 Schematic of the Input and Output Connections to the LabJack U3
A.2 Voltage-Current Converter Circuit
A.3 A finished student laboratory box.
A.4 BamaLab Main Application Window
A.5 BamaLab I(V) Measurement Module
A.6 BamaLab Step-function Response Module
A.7 BamaLab Oscilloscope Module
A.8 Example Measurements with the BamaLab
List of Tables
7.1 Relative Permeabilities and Remnant Fields of Some Magnetic Materials
1
Notation, Numbers, and Units
This chapter is only meant to introduce the notation we will use in the remainder of these
notes and provide some useful reference information. At the end of this chapter, you will
find useful tables and online references.
We have tried to be as consistent as possible through these notes in using the same symbols
and units for the same physical quantities and constants. In some cases, we have used slightly
different symbols or fonts where confusion may arise (e.g., 𝒫 for power, and P for pressure), or
introduced subscripts to differentiate between similar quantities (e.g., $\Phi_B$ for magnetic flux, and
$\Phi_E$ for electric flux). The tables at the end of this chapter are not necessarily exhaustive, but list
most of the symbols and notation used throughout these notes.
When there is more than one equivalent unit that can be used, the preferred one (if any) is in
bold. Of course, there are many other units in use in different fields for all of these quantities. We
are not interested in those specialized units, but only standard SI units. Note that all units can be
traced back to a combination of kg, m, s, and C - for example, electric field strength is [N/C] =
[V/m] = [m·kg/(s²·C)].
These boxes tell you the units of various quantities, values of physical constants, and
formulas you should already know.
These stylish boxes pose questions based on the surrounding material, give additional
tidbits of information (like useful web pages), and provide real-world examples. Some of
them also provide problem-solving hints.
In these boxes you will find any other sort of information, like definitions and bulleted
lists.
1.4 Units
Almost without exception, we will use the International System of Units - SI units (Système
international d'unités), the modern form of the metric system. The SI consists of a set of units, together
with a set of prefixes to indicate powers of ten. In the end, there are only seven base SI units,
and from these seven base units many others can be derived. When there is potential for confusion
between variables or symbols and units, we will often enclose the units in square brackets, e.g., [m]
or [kg·m/s²].
In the tables at the end of this chapter, we list the base SI units and a few common “derived”
units. Keep in mind that unless a problem specifically uses non-SI units (such as miles, inches,
or pounds), and asks for an answer in non-SI units (a very rare phenomenon), you should always
report your answers in SI units. A table at the end of this chapter lists the few non-SI units which
are generally accepted for use alongside SI units.
A prefix can be added to a particular unit (either base or derived) to indicate a power of ten
multiple of the original unit. The most common prefixes are also in the tables at the end of the
chapter. For instance, “kilo-” or “k” indicates a multiple of one thousand or $10^3$, so 1 km means
one kilometer, $1 \times 10^3$ m or 1000 m. Similarly, “milli-” or “m” indicates one one-thousandth ($10^{-3}$),
so 1 mm means one millimeter, $1 \times 10^{-3}$ m or 0.001 m. Prefixes are never combined - a millionth
of a kilogram is a milligram, not a microkilogram.
See http://en.wikipedia.org/wiki/SI and http://physics.nist.gov/cuu/Units/units.html
for more information.
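The prefix arithmetic above can be sketched in a few lines of Python. This is only an illustration: the table `SI_PREFIXES` and the helper `to_base` are names invented here, and only a handful of prefixes are included.

```python
# Powers of ten for a few common SI prefixes (illustrative subset only).
SI_PREFIXES = {"k": 3, "": 0, "m": -3, "u": -6}


def to_base(value, prefix):
    """Convert a prefixed value to the unprefixed unit, e.g. 1 km -> 1000 m."""
    return value * 10.0 ** SI_PREFIXES[prefix]


one_km_in_m = to_base(1, "k")            # one kilometer expressed in meters
one_mm_in_m = to_base(1, "m")            # one millimeter expressed in meters

# A millionth of a kilogram, expressed in grams, equals one milligram in grams.
millionth_kg_in_g = to_base(1e-6, "k")
one_mg_in_g = to_base(1, "m")

print(one_km_in_m, one_mm_in_m, millionth_kg_in_g, one_mg_in_g)
```

The last two values agree, which is why a millionth of a kilogram is written as 1 mg rather than with a combined prefix.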
Guidelines: Treat units like variables. Only add like terms. When a unit is divided by itself, the
division yields a unitless one. When two different units are multiplied, the result is a new unit,
referred to by the combination of the units. For instance, in SI, the unit of speed is metres per
second (m/s). See dimensional analysis. A unit can be multiplied by itself, creating a unit with an
exponent (e.g., m²/s²).
Some units have special names; however, these should be treated like their equivalents. For
example, one newton (N) is equivalent to one kg·m/s². This creates the possibility for units with
multiple designations, for example: the unit for surface tension can be referred to as either N/m
(newtons per metre) or kg/s² (kilograms per second squared).
Dimensional analysis can be a powerful tool to see if things look correct or not. For example,
consider Newton's law of universal gravitation, $F = GMm/r^2$. What are the units of $G$?
Here $F$ is force in kilogram-meters per second squared (kg·m/s²), $M$ and $m$ are masses in kilograms
(kg), and $r$ is a distance in meters (m). With dimensional analysis, we can check:
$$F = \frac{GMm}{r^2} \;\Rightarrow\; G = \frac{F r^2}{M m} \;\Rightarrow\; [G] = \frac{\left[\frac{\text{kg}\cdot\text{m}}{\text{s}^2}\right]\cdot\left[\text{m}^2\right]}{\left[\text{kg}^2\right]} = \frac{\text{m}^3}{\text{kg}\cdot\text{s}^2} = \frac{\text{N}\cdot\text{m}^2}{\text{kg}^2}$$
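The "treat units like variables" guideline can be mimicked in code by tracking the exponent of each base unit. The sketch below is a toy illustration, not a units library; the helpers `mul` and `power` and the dictionary representation are assumptions made here for the example.

```python
def mul(*dims):
    """Multiply units by adding the exponents of each base unit."""
    out = {}
    for d in dims:
        for unit, exponent in d.items():
            out[unit] = out.get(unit, 0) + exponent
    # A unit divided by itself yields a unitless one, so drop zero exponents.
    return {u: e for u, e in out.items() if e != 0}


def power(dim, n):
    """Raise a unit to the n-th power by scaling every exponent."""
    return {u: e * n for u, e in dim.items()}


KG, M, S = {"kg": 1}, {"m": 1}, {"s": 1}
NEWTON = mul(KG, M, power(S, -2))            # N = kg*m/s^2

# G = F r^2 / (M m)  ->  [N] * [m^2] / [kg^2]
G_UNITS = mul(NEWTON, power(M, 2), power(KG, -2))

# Surface tension: N/m should reduce to kg/s^2.
SURFACE_TENSION = mul(NEWTON, power(M, -1))

print(G_UNITS)
print(SURFACE_TENSION)
```

Here `G_UNITS` carries the exponents kg: -1, m: 3, s: -2, that is m³/(kg·s²), in agreement with the hand calculation.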
Table 1.1: Notation and Symbols Used. Preferred units in bold, when relevant.
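Constants:
$k_e = 8.98755 \times 10^{9}\ \text{N}\cdot\text{m}^2\cdot\text{C}^{-2}$
$\mu_0 \equiv 4\pi \times 10^{-7}\ \text{T}\cdot\text{m/A}$
$\epsilon_0 = 8.85 \times 10^{-12}\ \text{C}^2/\text{N}\cdot\text{m}^2$
$e = 1.60218 \times 10^{-19}\ \text{C}$
$h = 6.6261 \times 10^{-34}\ \text{J}\cdot\text{s} = 4.1357 \times 10^{-15}\ \text{eV}\cdot\text{s}$
$\hbar = \dfrac{h}{2\pi}$
$c = \dfrac{1}{\sqrt{\mu_0 \epsilon_0}} = 2.99792 \times 10^{8}\ \text{m/s}$
$m_{e^-} = 9.10938 \times 10^{-31}\ \text{kg} = 0.510998\ \text{MeV}/c^2$
$m_{p^+} = 1.67262 \times 10^{-27}\ \text{kg} = 938.272\ \text{MeV}/c^2$
$m_{n^0} = 1.67493 \times 10^{-27}\ \text{kg} = 939.565\ \text{MeV}/c^2$
$1\,\text{u} = 931.494\ \text{MeV}/c^2$
$hc = 1239.84\ \text{eV}\cdot\text{nm}$
$\dfrac{h}{m_e c} = 2.42631 \times 10^{-12}\ \text{m}$

Quadratic formula:
$0 = ax^2 + bx + c \;\Longrightarrow\; x = \dfrac{-b \pm \sqrt{b^2 - 4ac}}{2a}$

Basic Equations:
$\vec{F}_{\text{net}} = m\vec{a}$   Newton's Second Law
$\vec{F}_{\text{centr}} = -\dfrac{mv^2}{r}\,\hat{r}$   Centripetal
$KE = (\gamma - 1)\,mc^2 \approx \tfrac{1}{2}mv^2$   Kinetic energy
$KE_{\text{initial}} + PE_{\text{initial}} = KE_{\text{final}} + PE_{\text{final}}$

Magnetism:
$|\vec{F}_B| = q|\vec{v}||\vec{B}| \sin\theta_{vB}$   charge q
$|\vec{F}_B| = BIl \sin\theta$   wire
$|\vec{\tau}| = BIAN \sin\theta$   torque, current loop
$\vec{B} = \dfrac{\mu_0 I}{2\pi r}\,\hat{\theta}$   wire
$\vec{B} = \mu_0 \dfrac{N}{L} I\,\hat{z} \equiv \mu_0 n I\,\hat{z}$   solenoid
$\dfrac{|\vec{F}_{12}|}{l} = \dfrac{\mu_0 I_1 I_2}{2\pi d}$   2 wires, force per length

Electric Potential:
$\Delta V = V_B - V_A = \dfrac{\Delta PE}{q}$
$q\Delta V = \Delta PE$
$\Delta PE = q\Delta V = -q|\vec{E}||\Delta\vec{x}| \cos\theta = -qE_x \Delta x$   constant E field
$V_{\text{point charge}} = k_e \dfrac{q}{r}$
$PE_{\text{pair of point charges}} = k_e \dfrac{q_1 q_2}{r_{12}}$
$PE_{\text{system}}$ = sum over unique pairs of charges
$-W = \Delta PE = q(V_B - V_A)$

Current:
$I = \dfrac{\Delta Q}{\Delta t} = nqAv_d$
$J = \dfrac{I}{A} = nqv_d$
$v_d = \dfrac{-e\tau}{m} E$   $\tau$ = scattering time
$\varrho = \dfrac{m}{ne^2\tau}$
$\Delta V = \dfrac{\varrho l}{A} I = RI$
$R = \dfrac{\varrho l}{A} = \dfrac{\Delta V}{I}$
$P = I\Delta V = I^2 R = IV$   power

Optics:
$E = hf = \dfrac{hc}{\lambda} = \dfrac{1239.84\ \text{eV}\cdot\text{nm}}{\lambda\ (\text{nm})}$
$n = \dfrac{\text{speed of light in vacuum}}{\text{speed of light in a medium}} = \dfrac{c}{v}$
$\dfrac{\lambda_1}{\lambda_2} = \dfrac{v_1}{v_2} = \dfrac{c/n_1}{c/n_2} = \dfrac{n_2}{n_1}$   refraction
$\lambda_1 n_1 = \lambda_2 n_2$   refraction
$n_1 \sin\theta_1 = n_2 \sin\theta_2$   Snell's refraction

Resistors:
$I_{V\,\text{source}} = \dfrac{\Delta V}{R + r}$
$\Delta V_{V\,\text{source}} = \Delta V_{\text{rated}} \dfrac{R}{r + R}$
$I_{I\,\text{source}} = I_{\text{rated}} \dfrac{r}{r + R}$
$R_{\text{eq, series}} = R_1 + R_2$
$R_{\text{eq, par}} = \dfrac{R_1 R_2}{R_1 + R_2}$

ac Circuits:
$\tau = L/R$   RL circuit
$\tau = RC$   RC circuit
$X_C = \dfrac{1}{2\pi f C}$   “resistance” of a capacitor for ac
$X_L = 2\pi f L$   “resistance” of an inductor for ac

Nuclear:
$E^2 = p^2c^2 + m^2c^4$
alpha particle $= {}^{4}_{2}\alpha = {}^{4}_{2}\text{He}$   beta particle $= {}^{\ 0}_{-1}\beta = e^-$
Binding Energy $= \left[\sum_{p^+ \,\&\, n^0} mc^2\right] - m_{\text{atom}}c^2$

Vectors:
$|\vec{F}| = \sqrt{F_x^2 + F_y^2}$   magnitude
$\theta = \tan^{-1}\left[\dfrac{F_y}{F_x}\right]$   direction

Relativity:
$\gamma = \dfrac{1}{\sqrt{1 - \frac{v^2}{c^2}}}$
$\Delta t = \gamma \Delta t_p$
$L = L_p/\gamma$
$p = \gamma mv$
$v_1 = \dfrac{v_2 + v_1'}{1 + v_1' v_2/c^2}$
$KE = (\gamma - 1)mc^2$
$E_{\text{tot}} = \gamma mc^2 = KE + mc^2$
$E_{\text{rest}} = mc^2$
$E^2 = p^2c^2 + m^2c^4$

Quantum & Atomic:
$\lambda_{\text{out}} - \lambda_{\text{in}} = \dfrac{h}{m_e c}(1 - \cos\theta)$
$\lambda = \dfrac{h}{|\vec{p}|} = \dfrac{h}{\gamma mv} \approx \dfrac{h}{mv}$
$\Delta x \Delta p \geq \dfrac{h}{4\pi}$
$\Delta E \Delta t \geq \dfrac{h}{4\pi}$
$E_n = -13.6\ \text{eV}/n^2$
$E_i - E_f = -13.6\ \text{eV}\left(\dfrac{1}{n_i^2} - \dfrac{1}{n_f^2}\right) = hf$   Hydrogen only
$mvr = n\hbar$
$v^2 = \dfrac{n^2\hbar^2}{m_e^2 r^2} = \dfrac{k_e e^2}{m_e r}$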
Relativity
2
Relativity
Nearly all of the mechanical phenomena we observe around us every day have to do with
objects moving at speeds rather small compared to the speed of light. The Newtonian
mechanics you learned in previous courses handled these cases extraordinarily well. As it turns
out, however, Newtonian mechanics breaks down completely when an object’s speed is no longer
negligible compared to the speed of light. Not only does Newtonian mechanics fail in this situation,
it fails spectacularly, leading to a variety of paradoxical situations.
The resolution to these paradoxes is given by the theory of relativity, one of the most successful
and accurate theories in all of physics, which we will introduce in this chapter. Nature is not always
kind, however, and the consequences of relativity seem on their face to flout common sense and our
view of the world around us. We are used to the notion that our position changes with time when
we are in motion, but relativity implies that the passage of time itself changes when we are in motion.
Nevertheless, we shall see that relativity is an inescapable consequence of a few simple principles
and experimental facts. Moreover, as it turns out, this new description of nature is critical for
properly understanding electricity and magnetism, optics, and nuclear physics ... most of the rest
of this course!
What happens if we instead choose a different coordinate system $O'$, Fig. 2.2c, identical to
$O$ except that its origin is shifted downward by $y_i$ and to the left by $x_i$? Now the initial and final
positions of the person are $P_i^{O'} = (x_i, y_i)$ and $P_f^{O'} = (x_f, y_f)$. Still, the displacement $\Delta x$ is the same,
as you can easily verify. No matter whether we observe the person from the $O$ or $O'$ system, we
would describe the same displacement, even though the actual positions are completely different.
In special relativity, this simple situation no longer holds - observers in different coordinate
systems do not necessarily describe even the same displacement, much less the same position.
Fortunately, the corrections of special relativity to the Newtonian mechanics you have already
learned are only appreciable at very high velocities (non-negligible compared to the speed of light),
and for most everyday situations our usual intuition is still valid.
In any case, particularly those cases where relativistic effects are important, it is crucially important
that we specify in which coordinate system quantities have been measured. We will continue
to do this with a superscript of some sort to specify the coordinate system, and a subscript of
some sort to further describe what is being measured within that system. When we only have two
frames, like the example above, we will often just use a prime (′) to tell them apart. In the previous
example, this means we would use $P_f'$ instead of $P_f^{O'}$, and just $P_i$ instead of $P_i^{O}$. It seems pedantic
now, but careful bookkeeping is the only thing saving us from terrible confusion later!
$x^{O}_{\text{final}} = x^{O}_{f}$ : final x position of an object measured in the $O$ coordinate system
$v^{O'}_{\text{car}} \equiv v'_{\text{car}}$ : velocity of a car measured in the $O'$ coordinate system
$P_f^{O'} \equiv P_f' = (x_f', y_f')$ : final position of an object measured in the $O'$ coordinate system
Figure 2.3: A girl holding balloons is standing on the ground, at rest in reference frame $O$ ($v^{O}_{\text{girl}} = 0$).
First of all, we have to be more explicit about specifying which quantity is measured in which
frame. The velocity of the bully on the skateboard is measured relative to the girl standing on
the ground, in the $O$ system, so we write $v^{O}_{\text{bully}}$. When we talk about the dart, however, things
are a bit less clear. The bully on the skateboard would say that the velocity of the dart is $v^{O'}_{\text{dart}}$,
since he would measure its velocity relative to himself in the $O'$ frame. The girl would measure the
velocity of the dart relative to herself in the $O$ frame, $v^{O}_{\text{dart}}$. Clearly, $v^{O'}_{\text{dart}} \neq v^{O}_{\text{dart}}$ – in principle, the
two cannot agree on what the velocity of the dart is! Of course, that is a bit of an exaggeration.
In this simple everyday case, relative motion is fairly easy to understand, and we can intuitively
see exactly what is happening. Our intuition will start to fail us shortly, however, so it is best we
proceed carefully.
Explicitly labeling the velocity with the reference frame in which it is measured helps keep
everything precise, and helps us find a way out of this conundrum. It may seem like baggage now,
but ambiguity would cost us dearly later. Just to summarize, here is how we will keep the velocities
straight:
$v^{O}_{\text{bully}}$ = velocity of bully measured from the ground $\equiv v_{\text{bully}}$
$v^{O'}_{\text{dart}}$ = velocity of dart measured from the skateboard $\equiv v'_{\text{dart}}$
$v^{O}_{\text{dart}}$ = velocity of the dart measured by the girl $\equiv v_{\text{dart}}$
Whenever we are only dealing with two different coordinate systems, we will trim down the
notation a bit. We will just call one system the “primed” system, and add a ′ superscript to all
quantities, and leave the other one as the “unprimed” system, and drop the ‘O’. Which one we
call “primed” and which one is “unprimed” makes no difference; it is, after all, just notation and
bookkeeping.
What does the girl on the ground, in the $O$ system, really observe? Intuitively, we expect
her to see the dart moving at a velocity $v_{\text{dart}}$ which is that of the dart relative to the skateboard
plus that of the skateboard relative to the ground:
$$v_{\text{dart}} = v'_{\text{dart}} + v_{\text{bully}} \qquad (2.1)$$
velocity of the dart seen by the girl = velocity of dart relative to skateboard + velocity of skateboard relative to girl
The bully, in the $O'$ system (who threw the dart in the first place), just sees $v'_{\text{dart}}$. Just to be
concrete, let's say that the bully on the skateboard moves with $v_{\text{bully}} = 3$ m/s, and he throws the
dart with $v'_{\text{dart}} = 2$ m/s. Then the girl sees the dart coming at her balloons at 5 m/s.
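The everyday velocity addition of Eq. (2.1) can be written as a one-line function. This is just a sketch of the Newtonian rule with the numbers from the example; the function name `galilean_add` is made up here.

```python
def galilean_add(v_in_primed_frame, v_of_primed_frame):
    """Eq. (2.1): velocity seen in O = velocity in O' + velocity of O' relative to O."""
    return v_in_primed_frame + v_of_primed_frame


v_bully = 3.0        # m/s, skateboard (O') relative to the ground (O)
v_dart_prime = 2.0   # m/s, dart relative to the skateboard (O')

v_dart = galilean_add(v_dart_prime, v_bully)
print(v_dart)        # speed of the dart as seen by the girl, in m/s
```

With these inputs the girl measures 5 m/s, while the bully measures only 2 m/s for the same dart.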
Even in the simple example above, velocity depends on your frame of reference. This simple example
is completely arbitrary in a sense, though, and implies much more about relative motion. If these
two observers can’t agree on the velocity of the dart, as measured in their own reference frames,
who is to say what the absolute reference frame should be? After all, isn’t the ground itself moving
due to the orbit of the earth about the sun? And isn't the sun moving relative to the center of
the galaxy? Nothing is absolutely at rest; we cannot pick any special frame of reference to define
Luminiferous æther
Unfortunately, this idea just isn’t right. It has been disproven countless times by experiments,
and replaced by the far more successful theory of relativity. Light waves are not like sound waves.
Light is in one sense a wave, but a more modern viewpoint treats light as a stream of particles that
have a “wave-like nature.”ⁱ Particles do not need a medium to travel, and therefore neither does
light. There is no æther, and there is no preferred frame of reference. All motion is relative.
ⁱ We will explore this dual nature of light in Chapter 10.
Fine. There is no preferred reference frame or coordinate system, and all motion is relative. So
what? The example of Fig. 2.3 was plainly understandable. It is disturbingly easy to come up with
examples which are not so plainly understandable, however, which is one motivation for the theory
of relativity in the first place. Consider the two rockets in empty space traveling toward each other
in Fig. 2.5, separated by a distance $\Delta x$. The pilot of rocket 1 might say he or she is traveling at
a speed $v_1$ in his or her own reference frame ($O$), and the pilot of rocket 2 may claim he or she is
traveling at a speed $v_2'$ in his or her own $O'$. Without specifying what point they are measuring their
velocity relative to, can we say who is moving at what speed?
We have to imagine that we are deep in empty space, with nothing around either rocket to
provide a landmark or point of reference. The occupants of rocket 1 would feel as though they are
sitting still, and observe rocket 2 coming toward them, covering a distance $(v_1 + v_2)\Delta t$ in a time
interval $\Delta t$. The occupants of rocket 2, on the other hand, would think they are sitting still, and
would observe rocket 1 coming toward them, also covering a distance $(v_1 + v_2)\Delta t$ in a time interval
$\Delta t$.
Without any external reference point, or an absolute frame of reference, not only can we not say
with what speed each rocket is moving, we can’t even say who is moving! If we decide that rocket
1 is our reference frame, then it is sitting still, and rocket 2 is moving toward it. But we could just
as easily pick rocket 2 as our reference frame. Specifying who is moving, and with what speed, is
meaningless without a proper origin or frame of reference.
Has anything really changed physically? No. An analogy of sorts is to think about driving
alongside other cars on the highway, keeping pace with them. You might report your speed as
60 mi/hr. Relative to what? Clearly, in this case it is implied that the ground beneath you provides
a reference frame, and you are talking about your velocity relative to the earth. You wouldn't say
you are traveling at 60 mi/hr relative to the other cars (we hope) – your speed relative to the other
cars is zero if you are staying alongside them. Indeed, if you look out your window, the cars
next to you appear to be sitting still. This is only true at constant velocity – we can easily detect
accelerated motion, or an accelerated frame of reference due to the force experienced. This is the
realm of general relativity, Sect. 2.5.
In the end, one of the fundamental principles of special relativity is that the description of
relative constant velocity does not matter, so far as the laws of physics are concerned. The laws of
physics apply the same way to all objects in uniform (non-accelerated) motion, no matter how we
measure the velocity. We cannot devise an experiment to measure uniform motion absolutely, only
relative to a specific chosen frame of reference. More succinctly:
Principle of relativity:
All laws of nature are the same in all uniformly moving (non-accelerating) frames of
reference. No frame is preferred or special.
As another simple example, Fig. 2.6, consider Joe and Moe running at different (constant)
speeds in the same direction, initially separated by a distance $d_0$. Without specifying any particular
common frame of reference, we must be able to describe their relative motion, or how the separation
between Joe and Moe changes with time, even though we can’t speak of their absolute velocities
in any sense.
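[Figure 2.6: Joe and Moe, moving with velocities $\vec{v}_{\text{Joe}}$ and $\vec{v}_{\text{Moe}}$, initially separated by a distance $d_0$.]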
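Let's say we arbitrarily choose Joe's position at t = 0 as our reference point. It is easy then to
write down what Joe and Moe's positions are after any later time interval ∆t:
$$x_{\text{Joe}} = v_{\text{Joe}}\,\Delta t \qquad x_{\text{Moe}} = d_0 + v_{\text{Moe}}\,\Delta t$$
We can straightforwardly write down the separation between them (their relative displacement) as
well:
$$x_{\text{Moe}} - x_{\text{Joe}} = d_0 + (v_{\text{Moe}} - v_{\text{Joe}})\,\Delta t$$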
Sure enough, their relative displacement only depends on their relative velocity, $v_{\text{Moe}} - v_{\text{Joe}}$. Further,
both Joe and Moe would agree with this, since we could arbitrarily choose Moe's position at t = 0
as our reference point, and we would end up with the same answer. Since there is nothing special
about either position, we can choose any point whatsoever as a reference, and wind up with the
same result. We end up with the same physics no matter what reference point we choose; which
one we choose is all a matter of convenience in the end.
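The claim that the separation does not depend on the choice of reference point can be checked numerically. The sketch below is only an illustration: the function `separation`, the `origin` parameter (where Joe starts along the chosen axis), and the numerical values are all assumptions made for this example.

```python
def separation(d0, v_joe, v_moe, dt, origin=0.0):
    """Moe's position minus Joe's position after a time dt.

    `origin` is Joe's starting coordinate in whatever reference we pick;
    Moe starts a distance d0 ahead of him.
    """
    x_joe = origin + v_joe * dt
    x_moe = origin + d0 + v_moe * dt
    return x_moe - x_joe


# Hypothetical numbers: 10 m apart, Joe at 2 m/s, Moe at 3.5 m/s, 4 s elapsed.
d0, v_joe, v_moe, dt = 10.0, 2.0, 3.5, 4.0

from_joe_start = separation(d0, v_joe, v_moe, dt, origin=0.0)   # Joe's start is x = 0
from_moe_start = separation(d0, v_joe, v_moe, dt, origin=-d0)   # Moe's start is x = 0
expected = d0 + (v_moe - v_joe) * dt

print(from_joe_start, from_moe_start, expected)
```

All three numbers coincide: only the initial separation and the relative velocity enter the result.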
This seems simple enough, but if we think about this a bit longer, more problems arise. Who
measures the initial separation $d_0$, Joe or Moe? Who keeps track of the elapsed time ∆t? Does it
matter at all? Can the measurement of distance or time be affected by relative motion? Of course,
the answer is an awkward ‘yes’ or we would not dwell on this point. If we delve deeper into the
problem of relative motion, we come to the inescapable conclusion that not only is velocity a relative
concept, our notions of distance and time are relative as well, and depend on the relative motion
of the observer. In order to properly understand these deeper ramifications, however, we need to
perform a few more thought experiments.
Already, relativity has forced us to accept some rather non-intuitive facts. This is only the beginning!
A more fundamental and far-reaching principle of relativity is that the speed of light is a
constant, independent of the observer. No matter how we measure it, no matter what our motion
is relative to the source of the light, we will always measure its velocity to be the same value, c.
Light does not obey the principle of relative motion!
$c = 3 \times 10^{8}$ m/s
There is a relatively simple way to experimentally demonstrate that this seems to be true,
depicted in Fig. 2.7. The earth itself is in constant motion in its orbit around the sun, moving at
$\sim 3 \times 10^{4}$ m/s measured relative to distant stars (this in itself is a measurable quantity). Imagine
now that we carefully set up three lasers, each oriented in a different direction relative to earth’s
orbital velocity – one parallel (A), one antiparallel (B), and one at a right angle (C). We will further
set up each laser to emit short pulses of light, and carefully measure the time between pulses. In
this way, we can determine the speed of the light coming out of each laser.
Based on simple Newtonian mechanics and velocity addition, we would expect to measure a
slightly different velocity for each laser. In case A, we would expect the Earth's velocity to add to
that of light, $v_A = v_{\text{light}} + v_{\text{orbit}}$, while in case B, it should subtract, $v_B = v_{\text{light}} - v_{\text{orbit}}$. In case C, we
have to add vectors, $\vec{v}_C = \vec{v}_{\text{light}} + \vec{v}_{\text{orbit}}$, but the idea is the same.
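To get a feel for the size of the effect that Newtonian velocity addition would predict, here is a rough numerical sketch using the rounded values quoted in the text. The variable names are invented for the example, and case C is treated as a simple right-angle vector sum, which is an assumption about the geometry.

```python
import math

C = 3.0e8         # m/s, rounded speed of light used in the text
V_ORBIT = 3.0e4   # m/s, approximate orbital speed of the earth

# Newtonian (incorrect, as experiment shows) expectations for the three lasers.
v_a = C + V_ORBIT               # parallel to the orbital velocity
v_b = C - V_ORBIT               # antiparallel to the orbital velocity
v_c = math.hypot(C, V_ORBIT)    # at a right angle: magnitude of the vector sum

fractional_shift = V_ORBIT / C  # relative size of the expected change for A and B
print(f"expected shift for A and B: {fractional_shift * 100:.2f} %")
print(f"expected shift for C: {(v_c / C - 1) * 100:.1e} %")
```

The parallel and antiparallel cases would differ from c by about one part in ten thousand, while the perpendicular case would differ far less.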
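[Figure 2.7: Lasers oriented parallel (A), antiparallel (B), and at a right angle (C) to Earth's orbital velocity $\vec{v}_{\text{orbit}}$. Under Newtonian mechanics, measuring the speed of light from a laser pointed in different directions compared to Earth's orbital velocity should yield different results.]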
The effect should be small (∼ 0.01 %), but easily measurable. No effect is observed; the speed of
light is always the same value c. This experiment has been performed with increasingly fantastic
precision over the last 100 years [4], and no matter what direction we shine the light, we always
measure the same speed! (The current best limit [4] on the constancy of the speed of light is about 1
part in $10^{16}$.) One straightforward result of this experiment is that the idea of an æther is clearly
not right, as we discussed above. There are much more far-reaching consequences, which we must
consider carefully. First, let us re-iterate this idea more formally:
The speed of light is invariant
The speed of light in free space is independent of the motion of the source or observer. It
is an invariant constant.
This is not just idle speculation or theory, it has been confirmed again and again by careful
experiments. These experiments have established, for instance, that the speed of lightⁱⁱ does not
depend on the wavelength of light, on the motion of the light source, or the motion of the observer.
As examples, a wavelength dependence can be strongly ruled out by astronomical observations
of gamma ray bursts (to better than 1 part in $10^{15}$), while binary pulsars can rule out any
dependence on source motion. A dependence on observer motion was disproved along
with the æther (Sect. 2.2.1), which also established that light requires no medium for propagation.
As an example of this, we turn again to Joe and Moe (Fig. 2.8). Joe is in a rocket ($O'$), traveling
at 90% of the speed of light ($v = 0.9c$), while Moe is on the ground ($O$) with a flashlight. Moe
shines the flashlight parallel to Joe's trajectory in the rocket. At first sight, we would think that
Moe would measure the speed of the light leaving the flashlight as $c$, while Joe would measure
$v = c - 0.9c = 0.1c$.
ⁱⁱ Throughout this chapter, we refer to the speed of light in a vacuum.
[Figure 2.8: Joe's rocket (frame $O'$) travels at $|\vec{v}| = 0.9c$.]
Both Joe and Moe measure the same speed of light c, despite their relative motion! What if we
gave Joe the flashlight inside the rocket? No difference, both Joe and Moe measure the speed of
the light to be c. Think back to our example of relative motion in Fig. 2.3. It doesn’t seem to
make sense that light behaves differently, but that is how it is. As we shall see shortly, our normal
intuitions about everyday phenomena at relatively low velocities are no longer valid when velocities
approach that of light. The physics is fundamentally different, and our Newtonian instincts are in
the end only a low-speed approximation to reality. By the end of this chapter, though, we will be
armed with the proper tools to analyze this situation correctly from both viewpoints.
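As a preview, the relativistic velocity-addition formula listed in the reference table of Chapter 1, $v_1 = (v_2 + v_1')/(1 + v_1' v_2/c^2)$, already reproduces this behavior; its justification is deferred to Sect. 2.3.5. The sketch below simply evaluates that formula in units where c = 1. The function name and argument names are invented for the illustration.

```python
def add_velocities(u_prime, v, c=1.0):
    """Velocity in O of something moving at u_prime in O', where O' moves at v relative to O.

    Implements v1 = (v2 + v1') / (1 + v1' v2 / c^2) from the reference table.
    """
    return (u_prime + v) / (1.0 + u_prime * v / c**2)


# Joe's flashlight inside the rocket (O' moves at +0.9c relative to Moe's frame O).
light_seen_by_moe = add_velocities(1.0, 0.9)

# Moe's flashlight on the ground: from Joe's frame, the ground moves at -0.9c.
light_seen_by_joe = add_velocities(1.0, -0.9)

# The naive Newtonian guess from the text, for comparison.
newtonian_guess = 1.0 - 0.9

# At everyday speeds (dart at 2 m/s, skateboard at 3 m/s) the correction is negligible.
dart_seen_by_girl = add_velocities(2.0, 3.0, c=3.0e8)

print(light_seen_by_moe, light_seen_by_joe, newtonian_guess, dart_seen_by_girl)
```

Both observers obtain c for the light, whichever of them holds the flashlight, while the dart example still gives essentially 5 m/s.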
This theory of relativity restricted to inertial reference frames is known as the special theory
of relativity, while the more general theory of relativity which also handles accelerated reference
frames is simply known as the general theory of relativity (which we will touch on in Sect. 2.5).
The second postulate of special relativity - the invariance of the speed of light - can actually
be considered as a consequence of the first according to some mathematical formulations of special
relativity. That is, the constancy of the speed of light is required in order to make the first postulate
true. We will continue to hold it up as a second primary postulate of special relativity, however,
as some of the more non-intuitive consequences of special relativity are (in our view) more readily
apparent when one keeps this fact in mind.
The first principle of relativity essentially states that all physical laws should be exactly the
same in any vehicle moving at constant velocity as they are in a vehicle at rest. As a consequence,
at constant velocity we are incapable of determining absolute speed or direction of travel, we are
only able to describe motion relative to some other object. This idea does not extend to accelerated
reference frames, however. When acceleration is present, we feel fictitious forces that betray changes
in velocity that would not be present if we were at rest. All experiments to date agree with this
first principle: physics is the same in all inertial frames, and no particular inertial frame is special.
The principle of relativity is by itself more general than it appears. The principle of relativity
describes a symmetry in the laws of nature, that the laws must look the same to one observer as
they do to another. In physics, any symmetry in nature also implies a conservation law, such as
conservation of energy or conservation of momentum. If the symmetry is in time, such that two
observers at different times must observe the same laws of nature, then it is energy that must be
conserved. If two observers at different physical locations must observe the same laws of physics
(i.e., the laws of physics are independent of spatial translation), it is linear momentum that must be
conserved. The relativity principles imply deep conservation laws about space and time that make
testable predictions – predictions which must be in accordance with experimental observations in
order to be taken seriously. Relativity is not just a principle physicists have proposed; it is a
postulate that was required in order to describe nature as we see it. The consequences of
these postulates will be examined presently.
The speed of light is more than just a constant, it is a sort of ‘cosmic speed limit’ – no object can
travel faster than the speed of light, and no information can be transmitted faster than the speed
of light. If either were possible, causality would be violated: in some reference frame, information
could be received before it had been sent, so the ordering of cause-effect relationships would be
reversed. It is a bit much to go into, but the point is this: the speed of light is really a speed
limit, because if it were not, either cause and effect would not have their usual meaning, or sending
Figure 2.9: left: Joe is traveling in a (transparent) rocket ship, and turns on a light bulb in the exact center of the rocket.
middle: A short time ∆t later, in his frame O′ Joe sees the light rays hit both sides of the ship at the same time. right: Moe
on the ground observes Joe in his rocket moving at v = 0.9c. From his frame O, a time ∆t after the light leaves the bulb, the
ship moves forward by an amount c∆t but the light rays do not. Moe sees the light hit the back of the ship first – Moe and Joe
cannot agree on the simultaneity of events.
Now, what will Moe on the ground see?iii From his frame O, Moe sees the light emitted from
the bulb at t = 0. The ship and the light bulb are both moving relative to Moe at v = 0.9c, but
we have to be careful. First, Moe observes the same speed of light as Joe, even though the bulb
is moving. Once the light bulb is turned on, the first light leaves the bulb at v = c and diverges
radially outward from its point of creation. As this first light leaves the bulb, however, the ship is
still moving forward. The front of the ship moves away from the point of the light’s creation, while
the back moves toward it.
iii You might think nothing, as we neglected to mention that Joe’s ship is transparent.
In some sense, once the light is created, it isn’t really in either reference frame – it is traveling
at v = c no matter who observes it. The ship moved forward, but the point at which the light was
created did not. We attempt to depict this in Fig. 2.9, where from Moe’s point of view, after a
time ∆t the light rays emitted from the bulb seem to have emanated from a point somewhat behind
the rocket – a distance c∆t behind it. Thus, after some time, Moe sees the light hit the back of the
ship first! Joe and Moe seem to observe different events, and they cannot agree on whether the
light hits the front and the back of the ship simultaneously. Events which are simultaneous in Joe’s
reference frame are not in Moe’s reference frame, moving relative to him. Think about how this
plays out from Joe and Moe’s reference frames carefully. It is strange and non-intuitive, but if we
accept the speed of light as invariant, the conclusion is inevitable.
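The argument can be checked numerically with a minimal Python sketch. It assumes a ship whose half-length, as measured by Moe, is 10 m (a made-up value), and the function name `arrival_times` is purely illustrative. In Moe's frame the light moves at c while the back wall approaches the emission point and the front wall recedes from it, both at v.

```python
# Sketch: when does the light reach each end of Joe's ship, as timed by Moe?
# Assumption (not from the text): the half-length of the ship in Moe's frame is 10 m.

C = 299_792_458.0  # speed of light in m/s


def arrival_times(half_length, v):
    """Return (t_back, t_front) in Moe's frame for light emitted at the ship's center."""
    # The back wall moves toward the emission point, so the gap closes at c + v.
    t_back = half_length / (C + v)
    # The front wall moves away from the emission point, so the gap closes at c - v.
    t_front = half_length / (C - v)
    return t_back, t_front


t_back, t_front = arrival_times(10.0, 0.9 * C)
print(f"light reaches the back after  {t_back:.3e} s")
print(f"light reaches the front after {t_front:.3e} s")
print(f"ratio front/back = {t_front / t_back:.1f}")
```

The ratio of the two times is (c + v)/(c − v), which depends only on v and not on the size of the ship.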
$$\Delta t'_{\text{Joe}} = \frac{2d}{c} \tag{2.4}$$
So far so good. Since Joe is not moving relative to the mirrors, nothing unusual happens –
assuming he has superhuman vision, he just sees the light pulses bouncing back and forth between
the mirrors, Fig. 2.10a, straight up and down, and counts the number of round trips. Moe monitors
this situation from the ground, in his own reference frame O. Thankfully, the boxcar is transparent,
and Moe is able to see the light pulses and mirrors as well as the boxcar, moving at a velocity v
from his point of view.
Figure 2.10: left: Joe is traveling in a (transparent) boxcar, and he bounces laser beams between two mirrors inside the
boxcar. Since the distance between the mirrors is known, and the speed of light is constant, he can measure time in this way.
Joe measures the round trip time it takes the light to bounce from the bottom mirror, to the top, and back again. right: Moe
observes the mirrors from the ground. From his frame O, the boxcar and mirrors are moving but the light is not. He therefore
sees the light bouncing off of the mirrors at an angle. Using geometry and the constant speed of light, Moe also measures a
round trip time interval, but since the path he observes for the light is different, he measures a different time interval than Joe.
What does Moe see inside the boxcar? From his point of view, a light pulse is created at the
bottom mirror while the whole assembly moves in the x direction – mirrors, light pulse, and all!
Just like in the example of the light being flicked on in a space ship, the boxcar and mirrors have
moved, but the point at which the light was created has not – Moe appears to see the light traveling
at an angle. A light pulse is created at the bottom mirror, and it travels upward and sideways to
reach the top mirror some time later, a bit further along the x axis. Rather than seeing the pulses
going straight up and down, from Moe’s point of view, they zig-zag sideways along the x axis, as
shown in Fig. 2.10b.
So what? We know the speed of light is a constant, so both Joe and Moe must see the light
pulses moving at a velocity c, even though they appear to be moving along a different trajectory.
If Moe also uses the light pulses’ round trips to measure the passage of time, what time interval
does he measure? The speed of light is constant, but the apparent distance covered by the light
pulses is larger in Moe’s case. Not only has the light traveled in the y direction a distance 2d, over
the course of one round trip it has also moved horizontally due to the motion of the boxcar. If the
light has apparently traveled farther from Moe’s point of view, and the speed of light is constant,
then the apparent passage of time from Moe’s point of view must also be greater!
Just how long does Moe observe the pulse round trip to be? Let us examine one half of a round
trip, the passage of the light from the bottom mirror to the top. In that interval, from either
reference frame, the light travels a vertical distance of d. From Joe’s reference frame O′, the light
does not travel horizontally, so the entire distance covered is just d, and he measures the time
interval $\tfrac{1}{2}\Delta t'_{\text{Joe}} = d/c$. From Moe’s reference frame, the car has also travelled horizontally. Since he
sees the car moving at a velocity v, he would say that in his time interval $\tfrac{1}{2}\Delta t_{\text{Moe}}$ for one half round
trip, the car has moved forward by $\tfrac{1}{2}v\,\Delta t_{\text{Moe}}$. Thus, Moe would see the light cover a horizontal
distance of $\tfrac{1}{2}v\,\Delta t_{\text{Moe}}$ and a vertical distance d, as shown in Fig. 2.11.
Thus, according to Moe, the total distance that the light pulse covers in one half of a round trip is
the Pythagorean sum of the horizontal and vertical distances.
$$(\text{distance observed by Moe})^2 = d^2 + \left(\tfrac{1}{2}v\,\Delta t_{\text{Moe}}\right)^2 \tag{2.5}$$
Further, he must also observe the speed of light to be c just as Joe does. If he measures the
passage of time by counting the light pulses as Joe does, then he would say that after one half
round trip, the light has covered this distance at a speed c, and would equate this with a time
interval measured in his own reference frame. Put another way, he would say that the distance covered by
the light in one half round trip is just $\tfrac{1}{2}c\,\Delta t_{\text{Moe}}$, in which case we can rewrite the equation above
using $d = \tfrac{1}{2}c\,\Delta t'_{\text{Joe}}$:
$$\left(\tfrac{1}{2}c\,\Delta t_{\text{Moe}}\right)^2 = \left(\tfrac{1}{2}c\,\Delta t'_{\text{Joe}}\right)^2 + \left(\tfrac{1}{2}v\,\Delta t_{\text{Moe}}\right)^2 \tag{2.6}$$
Now we see that if the speed of light is indeed constant, there is no way that the time intervals
measured by Joe and Moe can be the same! The pulse seems to take longer to make the trip from
Moe’s perspective, since it also has to travel sideways, not just up and down. Solely due to the
constant and invariant speed of light, Joe and Moe must measure different time intervals, and Moe’s
must be the longer of the two. We can solve the equation above to find out just what time interval
Moe measures:
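$$\left(\tfrac{1}{2}c\,\Delta t_{\text{Moe}}\right)^2 = \left(\tfrac{1}{2}c\,\Delta t'_{\text{Joe}}\right)^2 + \left(\tfrac{1}{2}v\,\Delta t_{\text{Moe}}\right)^2 \tag{2.7}$$
$$c^2\left(\Delta t_{\text{Moe}}\right)^2 = c^2\left(\Delta t'_{\text{Joe}}\right)^2 + v^2\left(\Delta t_{\text{Moe}}\right)^2 \tag{2.8}$$
$$\left(\Delta t_{\text{Moe}}\right)^2\left(c^2 - v^2\right) = c^2\left(\Delta t'_{\text{Joe}}\right)^2 \tag{2.9}$$
$$\left(\Delta t_{\text{Moe}}\right)^2 = \frac{c^2}{c^2 - v^2}\left(\Delta t'_{\text{Joe}}\right)^2 \tag{2.10}$$
$$\Delta t_{\text{Moe}} = \frac{\Delta t'_{\text{Joe}}}{\sqrt{1 - v^2/c^2}} \tag{2.11}$$
$$\Longrightarrow \quad \Delta t_{\text{Moe}} = \Delta t'_{\text{Joe}}\,\frac{1}{\sqrt{1 - v^2/c^2}} \equiv \gamma\,\Delta t'_{\text{Joe}} \tag{2.12}$$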
Here we defined a dimensionless quantity $\gamma = 1/\sqrt{1 - v^2/c^2}$ to simplify things a bit; we’ll return
to that shortly. So long as v < c, the time interval that Moe measures is always larger than the
one Joe measures, by an amount which increases as the boxcar’s velocity increases. This is a
general result in fact: the time interval measured by an observer in motion relative to the clock is
always longer than that measured by an observer at rest relative to it. Typically, we say that the
moving observer measures a dilated time interval, hence this phenomenon is often referred to as time
dilation. Time dilation is symmetric – if Moe also had a clock on the ground, Joe would say that Moe’s clock
runs slow by precisely the same amount. It is only the relative motion that matters.
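The light-clock result can be verified numerically. The following is a minimal Python sketch, assuming a mirror separation of d = 1.5 m and v = 0.9c (both values chosen only for illustration; the function names are hypothetical). It solves the Pythagorean relation for Moe's round-trip time and compares it with γ times Joe's.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_gamma(v):
    """Lorentz factor for a speed v given in m/s."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def joe_round_trip(d):
    """Round-trip time in the clock's rest frame: the light travels 2d at speed c."""
    return 2.0 * d / C


def moe_round_trip(d, v):
    """Round-trip time for an observer who sees the clock move at speed v.

    Solving (c t / 2)^2 = d^2 + (v t / 2)^2 for t gives t = 2 d / sqrt(c^2 - v^2).
    """
    return 2.0 * d / math.sqrt(C ** 2 - v ** 2)


d = 1.5          # assumed mirror separation in meters
v = 0.9 * C      # assumed boxcar speed

t_joe = joe_round_trip(d)
t_moe = moe_round_trip(d, v)

# Half-trip path length seen by Moe: vertical leg d, horizontal leg v * t_moe / 2.
half_path = math.hypot(d, 0.5 * v * t_moe)

print(f"Joe measures {t_joe:.4e} s, Moe measures {t_moe:.4e} s")
print(f"ratio = {t_moe / t_joe:.4f}, gamma = {lorentz_gamma(v):.4f}")
```

The diagonal path `half_path` equals c times half of Moe's interval, which is exactly the statement that Moe also sees the pulse travel at c.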
Time dilation
The time interval ∆t between two events at the same location measured by an observer
moving with respect to a clock is always larger than the time interval between the same
two events measured by an observer stationary with respect to the clock. The ‘proper’
time ∆tp is that measured by the stationary observer.
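$$\Delta t'_{\text{moving}} = \gamma\,\Delta t_{\text{stationary}} = \gamma\,\Delta t_p \quad \text{where} \quad \gamma = \frac{1}{\sqrt{1 - v^2/c^2}} \tag{2.13}$$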
In other words, time is stretched out for a moving observer compared to one at rest.
In the example above, it is Moe who is in a reference frame moving relative to our light ‘clock’
and Joe is the stationary observer. Therefore, Joe measures the ‘proper’ time interval, while Moe
measures the dilated time interval. Incidentally, for discussions involving relativity, we basically
assume that there is always a clock sitting at every possible point in space, constantly measuring
time intervals, even though this is clearly absurd. What we really mean is the elapsed time that
a clock at a certain position would read, if we had one there. For the purpose of illustration, it is
just simpler to presume that everyone carries a fantastically accurate clock at all times.
The dimensionless quantity γ is the ratio of the time intervals measured by the observers
moving (Moe) and stationary (Joe) relative to the events being timed. This quantity, defined by
Eq. 2.14, comes up often in relativity, and it is called the Lorentz factor. Since c is the absolute
upper limit for the velocity of anything, γ is never less than 1. So long as the relative velocity
of the moving observer is fairly small relative to c, the correction factor is negligible, and we need
not worry about relativity (e.g., at a velocity of 0.2c, the correction is still only about 2%). In some
sense, the quantity γ is sort of a gauge for the importance of relativistic effects – if γ ≈ 1, relativity
can be neglected, while if γ is much above 1, we must include relativistic effects like time dilation.
Figure 2.12 provides a plot and table of γ versus v/c for reference. Note that as v approaches c, γ
increases extremely rapidly.
Lorentz factor γ:
$$\gamma = \frac{1}{\sqrt{1 - v^2/c^2}} \geq 1 \tag{2.14}$$
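To get a feel for how slowly γ departs from 1 and how quickly it grows near c, it can help to evaluate Eq. 2.14 directly. The sketch below is a minimal Python version; the function name `gamma_from_beta` and the list of sample speeds are illustrative choices, not part of the text.

```python
import math


def gamma_from_beta(beta):
    """Lorentz factor as a function of beta = v/c, valid for 0 <= beta < 1."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    return 1.0 / math.sqrt(1.0 - beta ** 2)


# Sample speeds as fractions of c (chosen for illustration).
for beta in (0.0, 0.01, 0.1, 0.2, 0.5, 0.9, 0.99, 0.999):
    g = gamma_from_beta(beta)
    print(f"v/c = {beta:<6}  gamma = {g:.5f}  1/gamma = {1.0 / g:.5f}")
```

At v = 0.2c the factor differs from 1 by only about 2%, in line with the estimate quoted above.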
In the case above, for velocities much less than c, when γ ≈ 1, Eq. 2.13 tells us that both Joe
and Moe measure approximately the same time interval, just as our everyday intuition tells us. In
fact, for most velocities you might encounter in your everyday life, the correction factor γ is only
different from 1 by a minuscule amount, and the effects of time dilation are negligible. They are
not, however, unmeasurable or unimportant, as we will demonstrate in subsequent sections – time
dilation has been experimentally verified to an extraordinarily high degree of precision, and does
have some everyday consequences.
v [m/s]         v/c       γ          1/γ
0               0         1          1
3 × 10^6        0.01      1.00005    0.99995
3 × 10^7        0.1       1.005      0.995
6 × 10^7        0.2       1.02       0.980
1.5 × 10^8      0.5       1.15       0.866
2.25 × 10^8     0.75      1.51       0.661
2.7 × 10^8      0.9       2.29       0.436
2.85 × 10^8     0.95      3.20       0.312
2.97 × 10^8     0.99      7.09       0.141
2.983 × 10^8    0.995     10.0       0.0999
2.995 × 10^8    0.999     22.4       0.0447
2.996 × 10^8    0.9995    31.6       0.0316
2.998 × 10^8    0.9999    70.7       0.0141
c               1         ∞          0
[Graph: γ versus v/c, with an inset showing the range 0 ≤ v/c ≤ 0.3.]
Figure 2.12: The Lorentz factor γ and its inverse for various velocities in table and graph form. The inset to the graph shows
an expanded view for low velocities.
Before we discuss the stranger implications of time dilation, it is worth discussing at least one
practical example in which the consequences of time dilation are important: the Global Positioning
System. As you probably know, the Global Positioning System (GPS) is a network of satellites
in medium earth orbit that transmit extremely precise microwave signals that can be used by a
receiver to determine location, velocity, and timing. Each GPS satellite repeatedly transmits a
message containing the current time, as measured by an onboard atomic clock, as well as other
parameters necessary to calculate its exact position. Since the microwave signals from the satellites
travel at the speed of light (microwaves are just a form of light, Sect. 9.5), knowing the time difference
between the moment the message was sent and the moment it was received allows an observer to
determine their distance from the satellite. A ground-based receiver collects the signals from at
least four distinct GPS satellites and uses them to determine its four space and time coordinates
(x, y, z, and t).
How does relativity come into play? The 31 GPS satellites currently in orbit are in a medium
earth orbit at an altitude of approximately 20,200 km, which gives them a velocity relative to the
earth’s surface of 3870 m/s [5].iv This means that the actual atomic clocks responsible for GPS timing
on the satellites are moving at nearly 4000 m/s relative to the receivers on the ground calculating
position. Therefore, based on our discussion above, we would expect that the satellite-based GPS
clocks would measure longer time intervals than those on the earth – the GPS clocks should run
slow, a problem for a system whose entire principle is based on precise timing.
iv You may remember from studying gravitation that the orbital speed can be found from Newton’s general law of
gravitation and centripetal force, $v = \sqrt{GM/r}$, where G is the universal gravitational constant, M is the mass of the
earth, and r is the radius of the orbit, as measured from the earth’s center.
How big is this effect? We already know enough to calculate the timing difference. Let us assume
that (somehow) at t = 0 we manage to synchronize a GPS clock with a ground-based one. From
that moment, we will measure the elapsed time as measured by both clocks until the earth-based
clock reads exactly 24 hours. We will call the earth-based clock’s reference frame O, and that on
the GPS satellite O′, and label the time intervals correspondingly. Since we are on the ground in
the earth’s reference frame, obviously we consider the earth-based clock to be the stationary one,
measuring the proper time, and the GPS clock is moving relative to us. Applying Eq. 2.13, the
elapsed time measured by the GPS clock and an earth-bound clock are related by a factor γ:
The difference between the two clocks is then straightforward to calculate, given the relative
velocity of the satellite of v = 3870 m/s ≈ 1.3 × 10^-5 c:
A grand total of about 7 µs slow over an entire day (about 0.3 µs per hour), only about 80 parts
per trillion (8 × 10^-11)! This may not seem like a lot, until one again considers that the
GPS signals are traveling at the speed of light, and even a small error in timing can translate into
a relatively large error in position. Remember, it is the travel time of light signals that determines
distance in GPS. If time dilation were not accounted for, a receiver using that signal to determine
distance would have an error given by the time difference multiplied by the speed of light. If we
presume that, conservatively, position measurements are taken only once per hour:
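The figures quoted in this passage (about 7 µs per day, about 0.3 µs per hour) can be reproduced with a short Python sketch. It assumes standard round values for the earth's gravitational parameter GM and mean radius, which are not given in the text, and it uses the low-speed approximation γ − 1 ≈ v²/2c². All variable names are illustrative.

```python
import math

C = 299_792_458.0          # speed of light in m/s
GM_EARTH = 3.986004418e14  # assumed value of G*M for the earth, in m^3/s^2
R_EARTH = 6.371e6          # assumed mean radius of the earth, in m
ALTITUDE = 2.02e7          # orbital altitude from the text, 20,200 km in m

# Orbital speed from v = sqrt(GM / r), with r measured from the earth's center.
r_orbit = R_EARTH + ALTITUDE
v_orbit = math.sqrt(GM_EARTH / r_orbit)

# For v << c, gamma - 1 is very nearly v^2 / (2 c^2).
beta = v_orbit / C
gamma_minus_one = 0.5 * beta ** 2

drift_per_day = gamma_minus_one * 24 * 3600   # seconds of drift per day
drift_per_hour = gamma_minus_one * 3600       # seconds of drift per hour

# A timing error becomes a distance error when multiplied by the speed of light.
range_error_per_hour = C * drift_per_hour

print(f"orbital speed      : {v_orbit:.0f} m/s")
print(f"gamma - 1          : {gamma_minus_one:.2e}")
print(f"drift per day      : {drift_per_day * 1e6:.2f} microseconds")
print(f"drift per hour     : {drift_per_hour * 1e6:.2f} microseconds")
print(f"range error / hour : {range_error_per_hour:.0f} m")
```

With these inputs the hourly timing error corresponds to a range error of roughly 90 m.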
In the end, GPS must be far more accurate than this, and the effects of special relativity and
time dilation must be accounted for, along with those of general relativity [5] (Sect. 2.5). Both
effects together amount to a discrepancy of about +38 µs per day. Since the orbital velocity of the
satellites is well-known and essentially constant, the solution is simple: the frequency standards for
the atomic clocks on the satellites are precisely adjusted to run slower and make up the difference.
Though time dilation seems a rather ridiculous notion at first, it has real-world consequences we
are familiar with, if unknowingly so.
Using the same analysis as above, your clock would differ by about 6 × 10^-9 s (6 ns) after
five hours, and still only 10 µs after one year. Definitely not enough to notice, but enough
to measure - current atomic clocks are accurate to ∼ 10^-10 s/day (∼ 0.1 ns/day). In fact,
in 1971 physicists performed precisely this sort of experiment to test the predictions of
time dilation in relativity, and found excellent agreement [6].
Now that we have a realistic calculation under our belt, let us consider a more extreme example.
We will take identical twins, Joe and Moe, and send Moe on a rocket into deep space while Joe
stays home. At the start of Moe’s trip, both are 25 years old. Moe boards his rocket, and travels
at v = 0.95c to a distant star, and back again at the same speed. According to Joe’s clock on earth,
this trip takes 40 years, and Joe is 65 years old when Moe returns. Moe, on the other hand, has
experienced time dilation, since relative to the earth’s reference frame and Joe’s clock he has been
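moving at 0.95c. Moe’s clock, therefore, runs more slowly and registers a shorter elapsed time:
$$\Delta t_{\text{Joe}} = 40\,\text{yr} \tag{2.26}$$
$$\Delta t'_{\text{Moe}} = \frac{40\,\text{yr}}{\gamma} \tag{2.27}$$
$$= 40\,\text{yr}\,\sqrt{1 - \left(\frac{0.95c}{c}\right)^2} \tag{2.28}$$
$$\approx 12.5\,\text{yr} \tag{2.29}$$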
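The arithmetic is easy to check. A minimal Python sketch follows, using the numbers from the example; the function name `traveler_elapsed` is an illustrative choice, and the calculation covers only the constant-velocity legs of the trip.

```python
import math


def traveler_elapsed(earth_years, beta):
    """Elapsed time on the traveler's clock for a trip lasting earth_years
    on the earth clock, at constant speed beta = v/c (acceleration ignored)."""
    return earth_years * math.sqrt(1.0 - beta ** 2)


START_AGE = 25      # both twins' age at departure, from the text
EARTH_YEARS = 40    # trip duration on Joe's clock, from the text

moe_years = traveler_elapsed(EARTH_YEARS, 0.95)

print(f"Joe ages {EARTH_YEARS} yr and is {START_AGE + EARTH_YEARS} at the reunion")
print(f"Moe ages {moe_years:.1f} yr and is {START_AGE + moe_years:.1f} at the reunion")
```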
It would seem, then, that while Joe is 65 years old when Moe returns, having aged 40 years, Moe
is 37.5 years old, having aged only 12.5 years! On the other hand, one of the principles of relativity
is that there is no preferred frame of reference; it should be equally valid to use the clock on Moe’s
rocket ship as the proper time. From Moe’s point of view, the earth is moving away from him at
0.95c. In his reference frame, Joe’s earth-bound clock should run slow, and Moe should be older
than Joe!
This is the so-called Twin ‘Paradox’ of special relativity. In fact, it is not a paradox, but a
misapplication of the notion of time dilation. The principles of special relativity we have been
discussing are only valid for non-accelerating reference frames. In order for Moe to move from the
earth’s reference frame to the moving reference frame of the rocket ship at 0.95c and back again,
he had to have accelerated during the initial and final portions of the trip, plus at the very least to
turn around. The reference frame of the earth is for all intents and purposes not accelerating, but
the reference frame on the ship is, and our calculation of the time dilation factor is not complete.
While the earth-bound clocks do run slow from the ship’s point of view so long as the velocity of
the spaceship is constant, during the accelerated portions of the trip the earth-bound clocks actually
run fast and gain time compared to the rocket’s clocks. An analysis including accelerated motion is
beyond the scope of this text, but the gains of the earth-bound clock during the accelerated portion
of the trip more than make up for the losses during the constant velocity portion of the trip, and
no matter who keeps track, Moe will actually be younger than Joe from any reference frame. In
short, there is no ‘paradox’ so long as the notions of relativity are applied carefully within their
limits.
If the passage of time itself is altered by relative motion, what else must also be different? If the
elapsed time interval depends on the relative motion of the clock and observer, then at constant
velocity one would also begin to suspect that distance measurements must also be affected. After
all, so far we have mostly talked about time in terms of objects or pulses of light traversing specific
distances at constant velocity. Naturally, in order to explore this idea we need another thought
experiment. Once again, it needs to involve a spaceship.
This time, the experiment is simple: a spaceship departs from earth toward a distant star,
Fig. 2.13. In accordance with our discussion above, we stipulate that we only consider the portion
of the ship’s journey where it is traveling at constant velocity, and there is no acceleration to worry
about. According to observations on the earth, the star is a distance L away, and the spaceship is
traveling at a velocity v. From the earth’s reference frame O, the amount of time the trip should
take ∆tE is easy to calculate:
$$\Delta t_E = \frac{L}{v} \tag{2.30}$$
Fair enough. On the spaceship, however, the passage of time is slowed by a factor γ due to
time dilation, and from their point of view, the trip takes less time. Since our spaceship is not
accelerating in this example (it doesn’t even have to turn around), we can readily apply Eq. 2.13.
Figure 2.13: Length contraction and travel to a distant star. A spaceship (frame O′) sets out from earth (frame O) at a
velocity v toward a distant star. Do the observers in the spaceship and the earth-bound observers agree on the distance to the
star?
From the spaceship occupant’s point of view, the earth is moving relative to them, so the time
interval should be divided by γ to reflect their shorter elapsed time interval.
$$\Delta t'_{\text{ship}} = \frac{\Delta t_E}{\gamma} \tag{2.31}$$
Keep in mind that by ‘clock’ we mean the passage of time itself; this includes biological processes.
We already know what ∆tE must be from Eq. 2.30, so we can plug that in to Eq. 2.31 above:
$$\Delta t'_{\text{ship}} = \frac{L}{v\gamma} \tag{2.32}$$
Do I divide or multiply by γ?
The Lorentz factor γ is always greater than or equal to 1, γ ≥ 1. If you are unsure about
whether to divide or multiply by γ, think qualitatively about which quantity should be
larger or smaller. In the example above, Eq. 2.31, we know the spaceship’s time interval
should be smaller than that measured on earth, so we know we have to divide the earth’s
time interval by γ.
If the occupants of the ship also measure their velocity relative to the earth (we will pretend they
even communicate with earth to make sure all observers agree on the relative velocity, v′ = v), then
they will presume that upon arrival at the distant star, the distance covered must be their velocity
times their measured time interval. From the ship occupant’s point of view, then, the distance to
the star is
$$L' = v\,\Delta t'_{\text{ship}} = \frac{v\,\Delta t_E}{\gamma} = \frac{L}{\gamma} \neq L \tag{2.33}$$
If you ask the people on the ship, the distance to the star is shorter, because their apparent time
interval is! As we might have guessed, the relativity of time measurement also manifests itself in
measurements of length, a phenomenon known as length contraction or Lorentz contraction.
Length Contraction
The length of an object (or the distance to an object) as measured
by an observer in motion is shorter than that measured by an observer at rest by a factor
1/γ. The proper length, Lp, is measured at rest with respect to the object.
$$L'_{\text{moving}} = \frac{L_{\text{stationary}}}{\gamma} = \frac{L_p}{\gamma} \tag{2.34}$$
That is, objects and distances appear shorter by 1/γ if you are moving relative to them.
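A quick numerical check ties Eqs. 2.30 through 2.34 together. The Python sketch below assumes a star 4.2 light-years away and a ship speed of 0.95c; both numbers are invented for illustration, as are the function names. Working in light-years and years makes the speed of light equal to 1, so a speed is just β = v/c.

```python
import math


def lorentz_gamma(beta):
    """Lorentz factor for beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)


def contracted_length(proper_length, beta):
    """Length measured by an observer moving at beta relative to the object."""
    return proper_length / lorentz_gamma(beta)


L_EARTH = 4.2   # assumed distance to the star in light-years (earth frame)
BETA = 0.95     # assumed ship speed as a fraction of c

t_earth = L_EARTH / BETA                   # Eq. 2.30, in years
t_ship = t_earth / lorentz_gamma(BETA)     # Eq. 2.31, in years
l_ship = contracted_length(L_EARTH, BETA)  # Eq. 2.34, in light-years

print(f"earth frame: distance {L_EARTH:.2f} ly, trip time {t_earth:.2f} yr")
print(f"ship frame : distance {l_ship:.2f} ly, trip time {t_ship:.2f} yr")
# Consistency check: in the ship frame, distance = speed * time.
print(f"v * t_ship = {BETA * t_ship:.2f} ly")
```

Both observers agree on the relative speed; they disagree on the distance and on the elapsed time, and by the same factor γ.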
This analysis isn’t just for distances, but any spatial dimension in the direction of motion. The
length of an object is measured to be shorter when it is moving relative to the observer than when
it is at rest - objects and distances appear shorter if you are moving relative to them. For example,
a baseball moving past you at very high velocity would be shortened only along one axis parallel
to the direction of motion, and would appear as an ellipsoid, not as a smaller sphere. It would be
“squashed” along the direction of the baseball’s motion only, as shown in Fig. 2.14. The length
contraction appears only along the direction in which there is relative motion. In this case, that
means the sphere looks contracted only in one direction, so it is squashed instead of just smaller.
Just like time dilation, the length contraction effect is negligibly small at everyday velocities.
Unlike time dilation, there is as yet no everyday application of length contraction, and no simple and
straightforward experimental proof. We have no practical way of measuring the length of an object
at extremely high velocities with sufficient precision at the moment. Collisions of elementary
particles at very high velocities in particle accelerators provide some strong but indirect evidence for
length contraction, and in some sense, since length contraction follows directly from time dilation,
the experimental verifications of time dilation all but verify length contraction.
A summary of sorts:
1. objects and distances in relative motion appear shorter by 1/γ
2. the length contraction is only along the direction of motion
3. the objects do not actually get shorter in their own reference frame, it is only
apparent to the moving observer
Now that we have a good grasp of time dilation and length contraction, we can start to answer the
more general question of how we translate between time and position of events seen by observers in
different reference frames. For example, consider Fig. 2.15. A girl in frame O is stationary relative to
a star at point P, known to be a distance x away, which suddenly undergoes a supernova explosion.
At precisely this instant, a boy travels past her on a skateboard at constant relative velocity v
along the x axis (frame O′). For convenience, we will assume that at the moment the explosion
occurs, he is exactly the same distance away as the girl. When and where does the supernova occur,
according to their own observations? How can we relate the distances and times measured by the
girl to those measured by the boy, and vice versa?
Figure 2.15: A stationary and moving observer watch a supernova explosion. A girl in frame O
is stationary relative to the supernova, a distance x away. A boy on a skateboard in the O′ frame is
traveling at v relative to frame O. How long does it take before the first light of the supernova reaches
each of them?
All we need to do is apply what we know of relativity thus far, and compare what each observer
would measure in their own frame with what the other would measure. In the girl’s case, the
situation is fairly straightforward. She is a distance x from the star, and the first light from the
explosion travels that distance at a velocity c. Therefore, according to her observations, the first
light arrives at a time
$$t_{\text{arrival}} = \frac{x}{c} \tag{2.35}$$
What about the boy on the skateboard, in frame O′? Since he is moving relative to the star,
the distance to the star appears length contracted from his point of view. At the instant of the
supernova, he measures a distance shorter by a factor γ compared to that measured by the girl.
Furthermore, from his point of view in his own reference frame, he is sitting still, and the supernova
is moving toward him at velocity v. Therefore, from his point of view the supernova is getting
closer to him. After t′ seconds by his clock, the supernova is a distance vt′ closer. Putting these
two bits together, the distance x′ the boy would measure to the supernova is:
$$x' = \frac{x}{\gamma} - vt' \tag{2.36}$$
So the distance to the supernova he claims is the original distance, length contracted due to his
motion relative to the supernova, minus the distance by which the supernova has approached him.
What would the girl say about all this? The distance between the boy and the supernova, from
her point of view, would have to be contracted to x′/γ since the boy is in motion relative to her.
Additionally, from her point of view, since the boy is moving away from her at v, the distance
between the two is increasing by vt after t seconds. We can express her perceived distance to the
supernova as the sum of two distances: the distance from her to the boy, and the distance from the
boy to the supernova:
$$x = vt + \frac{x'}{\gamma} \tag{2.37}$$
Now we have consistent expressions relating the distance measured by one observer to that
measured by the other. If we rearrange Eqs. 2.36 and 2.37 a bit, and put primed quantities on
one side and unprimed on the other, we arrive at the transformations between positions measured by
moving observers in their usual form:
$$x' = \gamma\,(x - vt) \tag{2.38}$$
$$x = \gamma\,(x' + vt') \tag{2.39}$$
Here (x, t) is the position and time of an event as measured by an observer in O stationary
to it. A second observer in O′, moving at velocity v, measures the same event to be at
position and time (x′, t′).
These equations include the effects of length contraction and time dilation we have already
developed, as well as including the relative motion between the observers. If we use Eqs. 2.36 and
2.37 together, we can also arrive at a more direct expression to transform the measurement times
as well. To start, we’ll take Eq. 2.38 as written, and substitute it into Eq. 2.39:
$$x = \gamma\,(x' + vt') \tag{2.40}$$
$$= \gamma\,\left[\gamma\,(x - vt) + vt'\right] \tag{2.41}$$
$$= \gamma^2 x - \gamma^2 vt + \gamma vt' \tag{2.42}$$
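So far it’s a bit messy, but it will get better. Now let’s solve that for t′. A handy relationship we
will make use of is $(1 - \gamma^2)/\gamma^2 = -v^2/c^2$, which you should verify for yourself.
$$\gamma vt' = \left(1 - \gamma^2\right)x + \gamma^2 vt \tag{2.43}$$
$$\Longrightarrow \quad t' = \gamma t + \frac{\left(1 - \gamma^2\right)x}{\gamma v} \tag{2.44}$$
$$= \gamma\left[t + \frac{1 - \gamma^2}{\gamma^2}\,\frac{x}{v}\right] \tag{2.45}$$
$$= \gamma\left[t - \frac{vx}{c^2}\right] \tag{2.46}$$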
And there we have it, the transformation between time measured in different reference frames.
A similar procedure gives us the reverse transformation for t in terms of x′ and t′.
Here (x, t) is the position and time of an event as measured by an observer in O stationary
to it. A second observer in O′, moving at velocity v, measures the same event to be at
position and time (x′, t′).
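The position and time transformations can be collected into a pair of small functions. This is a minimal Python sketch: the function names are hypothetical, the sample event and the speed 0.6c are arbitrary, and the reverse time transformation t = γ(t′ + vx′/c²) is written out here on the assumption that it follows from Eq. 2.46 by swapping the roles of the frames and reversing the sign of v.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_gamma(v):
    """Lorentz factor for a relative speed v in m/s."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def to_moving_frame(x, t, v):
    """Transform an event (x, t) in frame O to (x', t') in frame O'."""
    g = lorentz_gamma(v)
    return g * (x - v * t), g * (t - v * x / C ** 2)


def to_rest_frame(x_prime, t_prime, v):
    """Transform an event (x', t') in frame O' back to (x, t) in frame O."""
    g = lorentz_gamma(v)
    return g * (x_prime + v * t_prime), g * (t_prime + v * x_prime / C ** 2)


# An arbitrary sample event in O and an arbitrary relative speed.
x, t = 4.0e8, 2.0          # meters, seconds
v = 0.6 * C

x_prime, t_prime = to_moving_frame(x, t, v)
x_back, t_back = to_rest_frame(x_prime, t_prime, v)

print(f"in O : x  = {x:.4e} m, t  = {t:.4f} s")
print(f"in O': x' = {x_prime:.4e} m, t' = {t_prime:.4f} s")
print(f"back : x  = {x_back:.4e} m, t  = {t_back:.4f} s")
```

Transforming to O′ and back returns the original event, which is a useful sanity check on the signs in the two sets of equations.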
The first term in this equation is just the time it takes light to travel across the distance x from
point P, corrected for the effects of time dilation we now expect. The second term is new, and it
represents an additional offset between the clock on the ground and the one in the car, not just
one running slower than the other. What it means is that events seen by the girl in frame O do
not happen at the same time as viewed by the boy in O′!
This is perhaps clearer if we make two different measurements, and try to find the
elapsed time between two events. If our girl in frame O sees one event take place at position x1
and time t1, labeled as (x1, t1), and a second event at x2 and t2, labeled as (x2, t2), then she would
say that the two events were spatially separated by ∆x = x1 − x2, and the time interval between
them was ∆t = t1 − t2. If we follow the transformation to find the corresponding times that the boy
observes, $t'_1$ and $t'_2$, we can also calculate the boy’s perceived time interval between the events, ∆t′:
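A sketch of that calculation, applying the time transformation $t' = \gamma\,(t - vx/c^2)$ of Eq. 2.46 to each event and subtracting:

```latex
\begin{align*}
t'_1 &= \gamma\left(t_1 - \frac{v x_1}{c^2}\right), \qquad
t'_2 = \gamma\left(t_2 - \frac{v x_2}{c^2}\right) \\
\Delta t' &= t'_1 - t'_2
  = \gamma\left[(t_1 - t_2) - \frac{v\,(x_1 - x_2)}{c^2}\right]
  = \gamma\left(\Delta t - \frac{v\,\Delta x}{c^2}\right)
\end{align*}
```

Setting ∆x = 0 recovers the simple time dilation result ∆t′ = γ∆t.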
If an observer in O stationary relative to the events (x1, t1) and (x2, t2) measures a time
difference between them of ∆t = t1 − t2 and a spatial separation ∆x = x1 − x2, an observer
in O′ measures a time interval for the same events ∆t′. Events simultaneous in one frame
(∆t = 0) are only simultaneous in the other (∆t′ = 0) when there is no spatial separation
between the two events (∆x = 0).
For two events to be simultaneous, there has to be no time delay between them. For the girl
to say the events are simultaneous requires that she measure $\Delta t = 0$, while for the boy to say the
same requires $\Delta t' = 0$. We cannot satisfy both of these conditions based on Eq. 2.49 unless there
is no relative velocity between observers ($v = 0$), or the events being measured are not spatially
separated ($\Delta x = 0$). This means two observers in relative motion will only find the same events simultaneous
if the events are not spatially separated! Put simply, events are only simultaneous in both
reference frames if they happen at the same spot. At a given velocity, the larger the
separation between the two events, the greater the degree of non-simultaneity. Similarly, for a
given separation, the larger the velocity, the greater the discrepancy between the two frames. This
is sometimes called failure of simultaneity at a distance.
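To get a feel for the size of the effect, here is a minimal Python sketch of the interval relation $\Delta t' = \gamma(\Delta t - v\Delta x/c^2)$. The function names and the sample numbers (two events 300 km apart, a relative speed of 0.6c) are illustrative assumptions, not values from the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def gamma(v):
    """Lorentz factor for a relative speed v in m/s."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def interval_in_primed_frame(dt, dx, v):
    """Time interval in O' for two events separated by dt (s) and dx (m) in O."""
    return gamma(v) * (dt - v * dx / C**2)

# Hypothetical example: events simultaneous in O (dt = 0) but 300 km apart,
# viewed from a frame O' moving at 0.6c.
dt_prime = interval_in_primed_frame(0.0, 3.0e5, 0.6 * C)
print(f"dt' = {dt_prime:.3e} s")  # nonzero, so not simultaneous in O'

# With no spatial separation, the events are simultaneous in both frames.
print(interval_in_primed_frame(0.0, 0.0, 0.6 * C))
```

For these numbers the offset is a bit under a millisecond in magnitude. It grows with both the separation and the relative speed, as described above.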
In the end, this is our general formula for time dilation, including events which are spatially
separated. If we plough still deeper into the consequences of special relativity and simultaneity, we
will find that our principles of relativity have indeed preserved causality - cause always precedes
effect - it is just that what one means by “precede” depends on which observer you ask. What
relativity says is that cause must precede its effect according to all observers in inertial frames,
which equivalently prevents both faster than light travel or communication and influencing the
past.
We are now ready to make a summary of the relativistic transformations of time and space. Let us
consider two reference frames, $O$ and $O'$, moving at a constant velocity $v$ relative to one another.
For simplicity, we will consider the motion to be along the $x$ and $x'$ axes in each reference frame, so
the problem is still one-dimensional. The observer in frame $O$ measures an event to occur at time $t$
and position $(x, y, z)$. The event is at rest with respect to the $O$ frame. Meanwhile, the observer in
frame $O'$ measures the same event to take place at time $t'$ and position $(x', y', z')$. Based on what
we have learned so far, we can write down the general relations between space and time coordinates
in each frame, known as the Lorentz transformations:
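\begin{align}
x' &= \gamma \left( x - v t \right) \quad\text{or}\quad x = \gamma \left( x' + v t' \right) \tag{2.50}\\
y' &= y \tag{2.51}\\
z' &= z \tag{2.52}\\
t' &= \gamma \left( t - \frac{v x}{c^2} \right) \quad\text{or}\quad t = \gamma \left( t' + \frac{v x'}{c^2} \right) \tag{2.53}
\end{align}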
Here we have provided both the ‘forward’ and ‘reverse’ forms of the transformations for convenience.
Again, the distance is only contracted along the direction of motion, the $x$ and $x'$ directions;
the $y$ and $z$ coordinates are thus unaffected. When the velocity is small compared to $c$ ($v \ll c$),
the first equation gives us our normal Newtonian result: the position in one frame relative to the
other is just offset by their relative velocity times the time interval, and the time is the same. These
compact equations encompass all we know of relativity so far: length contraction, time dilation,
and lack of simultaneity.
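The forward and reverse forms can be checked against each other numerically. The sketch below assumes units in which $c = 1$ (so the speed is given as $\beta = v/c$) and handles only the $x$ and $t$ coordinates. The function names and the sample event are illustrative.

```python
import math

def gamma(beta):
    """Lorentz factor for a speed beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta**2)

def to_primed(x, t, beta):
    """Forward transformation (x, t) in O -> (x', t') in O', with c = 1."""
    g = gamma(beta)
    return g * (x - beta * t), g * (t - beta * x)

def to_unprimed(x_p, t_p, beta):
    """Reverse transformation (x', t') in O' -> (x, t) in O, with c = 1."""
    g = gamma(beta)
    return g * (x_p + beta * t_p), g * (t_p + beta * x_p)

# Hypothetical event at x = 4, t = 5 (in units with c = 1), frames moving at 0.6c.
x_p, t_p = to_primed(4.0, 5.0, 0.6)
print(x_p, t_p)
print(to_unprimed(x_p, t_p, 0.6))  # should recover the original (4, 5)
```

Applying the forward transformation and then the reverse one returns the starting coordinates, up to floating-point rounding.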
[Figure, caption partially recovered: "... throw a ball off of the car at a velocity $v_b$ relative
to the ground. How do we relate the velocities as measured in the different reference frames?"]
per unit time. If we calculate the displacement and time in one reference frame, then transform
both to the other reference frame, we can divide them to correctly find velocity.
Let’s start with the velocity of the ball as measured by the observer on the cart, $v_b'$. The
displacement of the ball relative to the cart at some time $t'$ after it was thrown, also measured in
the cart’s frame $O'$, is just $x_b' = v_b' t'$. This is just how far ahead of the car the ball is after some time
$t'$. We can substitute this into Eq. 2.50 to find out what displacement the observer on the ground
in $O$ should measure, remembering that $v_a$ is the relative velocity of the observers:
\[ x_b = \gamma \left( x_b' + v_a t' \right) = \gamma \left( v_b' t' + v_a t' \right) \tag{2.54} \]
But now we have $x$, the displacement of the ball seen from $O$, in terms of $t'$, the time measured
in $O'$. If we want to find the velocity of the ball as measured by an observer in $O$, we have to divide
the distance measured in $O$ by the time measured in $O$! We can’t divide one person’s position by
another person’s time; we have to transform both. So we should use Eq. 2.53 to find out what $t$ is
from $t'$ too:
\[ t = \gamma \left( t' + \frac{v_a x'}{c^2} \right) = \gamma \left( t' + \frac{v_a v_b' t'}{c^2} \right) \tag{2.55} \]
Now we have the displacement of the ball x and the time t as measured by the observer on the
ground in O. The velocity in O is just the ratio of x to t:
\begin{align}
v_b &= \frac{x}{t} \tag{2.56}\\
&= \frac{\gamma \left( v_b' t' + v_a t' \right)}{\gamma \left( t' + \frac{v_a v_b' t'}{c^2} \right)} \tag{2.57}\\
&= \frac{v_b' + v_a}{1 + \frac{v_a v_b'}{c^2}} \tag{2.58}
\end{align}
For the last step, we divided out $\gamma t'$ from everything, by the way. So, this is the proper way to
compute the velocity of the ball observed from the ground, consistent with our framework of
relativity.
\[ \text{velocity of ball observed from the ground} = v_b = \frac{v_a + v_b'}{1 + \frac{v_a v_b'}{c^2}} \tag{2.59} \]
In the limiting case that the velocities are very small compared to $c$, it is easy to see that
the expression above reduces to $v_b = v_a + v_b'$: the velocity of the ball measured from the ground
is the velocity of the car relative to the ground plus the velocity of the ball relative to the car.
But this is only true when the velocities are small compared to $c$.[v] Similarly, we could solve this
equation for $v_b'$ instead and relate the velocity of the ball as measured from the car to the velocities
measured from the ground:
\[ \text{velocity of ball observed from the cart} = v_b' = \frac{v_b - v_a}{1 - \frac{v_a v_b}{c^2}} \tag{2.60} \]
The equation above allows us to calculate the velocity of the ball as observed from the car if
we only had ground-based measurements. Again, for low velocities, we recover the expected result
$v_b' = v_b - v_a$. What about the velocity of the cart? We don’t need to transform it, since it is already
the relative velocity between the frames $O$ and $O'$, and hence between the ground-based observer
and the car. We only need the velocity addition formula when a third party is involved.
Out of the three relevant velocities, we only ever need to know two of them.
So this is it. This simple formula is all that is needed to properly add velocities and obey the
principles of relativity we have put forward. Below, we put this in a slightly more general formula.
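[Footnote v] Or, more precisely, when the product of the velocities is small compared to $c^2$.
\[ v_{obj} = \frac{v + v_{obj}'}{1 + \frac{v\,v_{obj}'}{c^2}} \qquad\qquad v_{obj}' = \frac{v_{obj} - v}{1 - \frac{v\,v_{obj}}{c^2}} \tag{2.61} \]
Here $v$ is the velocity of frame $O'$ relative to frame $O$.
Again, $v_{obj}$ is the object’s velocity as measured from the $O$ reference frame, and $v_{obj}'$ is
its velocity as measured from the $O'$ reference frame.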
Remember, c isn’t just the speed of light, it is a limiting speed for everything!
Just to be clear, let us make our previous example more concrete. Let’s say we have Joe in reference
frame $O$, sitting on the ground, while Moe is in a car (frame $O'$) moving at $v_{car} = \frac{3}{4}c$. Moe throws
a ball very hard out of the car window, such that he measures its velocity to be $v_{ball}' = \frac{1}{2}c$ in his
reference frame. What would Joe say that the velocity of the ball is, relative to his reference frame
on the ground?
Basically, Joe wants to know the velocity of the ball relative to the ground, not relative to the
car. What we need to do is relativistically combine the velocity of the car relative to the ground
and the velocity of the ball relative to the car. Classically, we would just add them together:
\[ v_{ball} = v_{car} + v_{ball}' = \frac{3}{4}c + \frac{1}{2}c = \frac{5}{4}c = 1.25c \qquad \text{WRONG!} \]
Clearly this is an absurdity: the ball cannot be traveling faster than the speed of light in anyone’s
reference frame. We need to use the proper relativistic velocity addition formula, Eq. 2.61. We
know the velocity of the ball relative to the car in frame $O'$, $v_{ball}'$, and the velocity of the car relative
to the ground in the $O$ frame, $v_{car}$, so we just substitute and simplify:
\begin{align}
v_{ball} &= \frac{v_{car} + v_{ball}'}{1 + \frac{v_{car} v_{ball}'}{c^2}} \tag{2.62}\\
&= \frac{\frac{3}{4}c + \frac{1}{2}c}{1 + \frac{\left(\frac{3}{4}c\right)\left(\frac{1}{2}c\right)}{c^2}} \tag{2.63}\\
&= \frac{\frac{5}{4}c}{1 + \frac{3}{8}} = \frac{10}{11}c \approx 0.91c \tag{2.64}
\end{align}
So, in relativity, three quarters plus one half is only about 0.9! But this is the result we are
looking for: no matter what velocities $v < c$ we add together, we always get an answer less than
$c$. Put another way, no matter what reference frame we consider, the velocity of an object will
always be observed to be less than $c$. So our relativistic velocity addition works so far. But what
about applying it to light, which is actually traveling right at $c$? Does everything still come out ok?
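The arithmetic above is easy to check with exact fractions. In this sketch, velocities are expressed as fractions of $c$, so the factor of $c^2$ in the denominator drops out. The function names are illustrative choices.

```python
from fractions import Fraction

def add_velocities(v_frame, v_obj_primed):
    """Velocity of an object in O, given the frame velocity and the object's
    velocity in O'. Both inputs and the result are fractions of c."""
    return (v_frame + v_obj_primed) / (1 + v_frame * v_obj_primed)

def subtract_velocities(v_obj, v_frame):
    """Velocity of an object in O', given its velocity in O and the frame
    velocity. Both inputs and the result are fractions of c."""
    return (v_obj - v_frame) / (1 - v_frame * v_obj)

v_car = Fraction(3, 4)
v_ball_primed = Fraction(1, 2)

v_ball = add_velocities(v_car, v_ball_primed)
print(v_ball, float(v_ball))  # 10/11, roughly 0.909

# Transforming back to the car's frame recovers what Moe measured.
print(subtract_velocities(v_ball, v_car))  # 1/2
```

Using `Fraction` avoids rounding, so the round trip from the car frame to the ground frame and back is exact.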
What if, instead of throwing a ball out of the window, Moe uses a flashlight to send out a light
pulse? In that case, we have to find that the velocity of light is c no matter which frame we use.
Remember our problem in Sect. 2.2.3? We had Joe traveling on a rocket at 0.99c, while Moe on the
ground shines a flashlight parallel to his path, shown again in Fig. 2.17. Our claim at the time was
that both Moe and Joe should measure the same speed of light. Does our new velocity addition
formula work for this case?
[Figure: frame $O'$ with its coordinate axes; the velocity label reads $|\vec{v}| = 0.9c$. Image not recovered.]
In this case, Joe is on a rocket (frame $O'$) moving at $v_{rocket} = 0.99c$ relative to Moe on the
ground. Moe knows that in his frame $O$, the light from the flashlight travels away from him at
velocity $v_{light} = c$. What is the velocity of light observed by Joe in the rocket, $v_{light}'$, if we use the
velocity addition formula? All we have to do is relativistically subtract Joe’s speed on the rocket ship
from the speed of light as measured by Moe, according to the second equation in 2.61:
\begin{align}
v_{light}' &= \frac{v_{light} - v_{rocket}}{1 - \frac{v_{rocket} v_{light}}{c^2}} \tag{2.65}\\
&= \frac{c - 0.99c}{1 - \frac{(0.99c)(c)}{c^2}} \tag{2.66}\\
&= \frac{0.01c}{1 - 0.99} = c \tag{2.67}
\end{align}
Lo and behold, the thing works! Our velocity addition formula correctly calculates that both
Joe and Moe have to measure the same speed of light, since the speed of light is the same when
observed from any reference frame. We shouldn’t be too surprised, however: the velocity addition
formula was constructed to behave in exactly this way. How about if Joe holds the flashlight while
in the rocket, what is the speed of light as measured by Moe on the ground? Now we have to add
the velocities of the light coming out of the rocket and the velocity of the rocket itself, according
to the first equation in 2.61. Still no problem:
\begin{align}
v_{light} &= \frac{v_{rocket} + v_{light}'}{1 + \frac{v_{rocket} v_{light}'}{c^2}} \tag{2.68}\\
&= \frac{0.99c + c}{1 + \frac{(0.99c)(c)}{c^2}} \tag{2.69}\\
&= \frac{1.99c}{1 + 0.99} = c \tag{2.70}
\end{align}
In the end, we have succeeded in constructing a framework of mechanics that keeps the speed of
light invariant in all reference frames, and answers (nearly) all the questions raised at the beginning
of the chapter.
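The same check can be repeated for any frame speed, not just 0.99c. The sketch below again expresses velocities as exact fractions of $c$. The function names and the list of sample speeds are illustrative assumptions.

```python
from fractions import Fraction

def add_velocities(v_frame, v_obj_primed):
    """Relativistic velocity addition; all velocities are fractions of c."""
    return (v_frame + v_obj_primed) / (1 + v_frame * v_obj_primed)

def subtract_velocities(v_obj, v_frame):
    """Relativistic velocity subtraction; all velocities are fractions of c."""
    return (v_obj - v_frame) / (1 - v_frame * v_obj)

c = Fraction(1)  # light moves at exactly 1 in units of c

# Hypothetical frame speeds, as fractions of c.
for v_frame in (Fraction(1, 10), Fraction(1, 2), Fraction(99, 100)):
    seen_from_ground = add_velocities(v_frame, c)       # flashlight on the rocket
    seen_from_rocket = subtract_velocities(c, v_frame)  # flashlight on the ground
    print(v_frame, seen_from_ground, seen_from_rocket)
```

For every frame speed below $c$, both results come out as exactly 1, that is, the speed of light itself.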
notions of relative velocity need to be altered, then the next thing must surely be momentum and
kinetic energy. As it turns out, even our concept of mass needs to be tweaked a bit.
Relativistic momentum:
\[ \vec{p} = \gamma m \vec{v} \tag{2.71} \]
Here $\vec{p}$ is the momentum vector for an object of mass $m$ moving with velocity $\vec{v}$.
The derivation is a bit beyond the scope of our discussion, but defining momentum in this way
makes it independent of the choice of reference frame, and restores conservation of momentum as a
fundamental physical law. For low velocities ($v \ll c$), $\gamma \approx 1$, and this reduces to the familiar result $\vec{p} = m\vec{v}$.
For velocities approaching c, the momentum grows much more quickly than we would expect.
In fact, an object traveling at c would require infinite momentum (and therefore infinite kinetic
energy), clearly an absurdity. This is one good reason why nothing with finite mass can ever travel
at the speed of light! Only light itself, with no mass, can travel at the speed of light.
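To see how quickly the relativistic momentum departs from the classical value, the sketch below tabulates both in units of $mc$, so that the classical momentum is just $\beta = v/c$ and the relativistic one is $\gamma\beta$. The function names and sample speeds are illustrative.

```python
import math

def gamma(beta):
    """Lorentz factor for a speed beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta**2)

def classical_momentum(beta):
    """Classical momentum m*v in units of m*c."""
    return beta

def relativistic_momentum(beta):
    """Relativistic momentum gamma*m*v in units of m*c."""
    return gamma(beta) * beta

for beta in (0.01, 0.1, 0.5, 0.9, 0.99, 0.999):
    p_cl = classical_momentum(beta)
    p_rel = relativistic_momentum(beta)
    print(f"v = {beta:5.3f} c   classical = {p_cl:6.3f}   relativistic = {p_rel:8.3f}")
```

At low speeds the two columns agree closely. As $\beta$ approaches 1 the relativistic value grows without bound.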
however, the result is not so simple. If a composite object contains multiple, independently moving
bodies (such as the individual atoms making up matter, for instance), the individual entities may
interact among themselves and move about, and the object possesses internal energy $E_i$ as well as
the kinetic energy due to the motion of the whole mass. Overall, classically the kinetic energy of
such a body is the sum of these two energies: the energy due to the motion of the object as a
whole, and the energy due to the motion of the constituents of the object, $KE = \frac{1}{2}mv^2 + E_i$. Any
moving body more complex than a single point mass has a contribution due to its internal energy.
In relativity, the kinetic energy does still depend on the motion of a body as a whole as well as
its internal energy content. As with momentum, conservation of energy requires that the energy of
a body is independent of the choice of reference frame, the total energy of a body cannot depend on
the frame in which it is measured. The total energy – kinetic plus internal – must be the same in
all reference frames. A derivation requires somewhat more math than we would like, but the result
is simple:
\[ E = \gamma m c^2 \tag{2.72} \]
This equation already tells us that the energy content of a body grows rapidly as $v$ approaches
$c$, and reaching the speed of light would require a body to have infinite energy. What is more
interesting, however, is when the velocity of the body is zero, i.e., $\gamma = 1$. In this case, $E = mc^2$: the
body has finite energy even when not in motion! This is Einstein’s most famous equation, and it
represents the fundamental equivalence of mass and energy. Any object has an intrinsic, internal
energy associated with it by virtue of having mass. This constant energy is called the rest energy:
Rest Energy:
\[ E_R = m c^2 \tag{2.73} \]
As Einstein himself put it, “Mass and energy are therefore essentially alike; they are only
different expressions for the same thing.”[7] Matter is basically an extremely dense form of energy:
it is convertible into energy, and vice versa. In fact, the rest energy content of matter is enormous,
owing to the enormity of $c^2$: one gram of normal matter corresponds to about $9 \times 10^{13}$ J, the same
energy content as 21 ktons of TNT! It is the conversion of matter to energy that is responsible for
the enormous energy output of nuclear reactions, such as those that power the sun, a subject we
will return to.
The equivalence of matter and energy, or, if you like, the presence of an internal energy due
solely to a body’s matter content, is an unexpected consequence of relativity. But we still have not
determined the actual kinetic energy of a relativistic object! Again, the derivation is somewhat
laborious, but the result is easy enough to understand. If we take the total energy of an object,
Eq. 2.72, and subtract off the velocity-independent rest energy, Eq. 2.73, what we are left with
is the part of a body’s energy that depends solely on velocity. This is the kinetic energy we are
looking for, and it means the total energy of a body is the sum of its rest and kinetic energies:
\[ KE = \left( \gamma - 1 \right) m c^2 \tag{2.74} \]
Total energy:
\[ E_{total} = KE + E_R \tag{2.75} \]
Since $\gamma = 1$ when $v = 0$, the kinetic energy of a stationary body is zero, as we expect. At low
velocities ($v \ll c$), one can show that this expression correctly reduces to $\frac{1}{2}mv^2$. As with the total
energy, for a body to actually acquire a velocity of $c$ it would need an infinite kinetic energy, again,
a primary reason why no object with mass can travel at the speed of light.
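Both claims, the size of the rest energy and the low-velocity limit of the kinetic energy, can be checked numerically. This is a minimal sketch in SI units. The function names and the test speed of 0.01c are illustrative choices.

```python
import math

C = 2.998e8  # speed of light in m/s

def gamma(v):
    """Lorentz factor for a speed v in m/s."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def rest_energy(m):
    """Rest energy m c^2 in joules, for a mass m in kg."""
    return m * C**2

def kinetic_energy(m, v):
    """Relativistic kinetic energy (gamma - 1) m c^2 in joules."""
    return (gamma(v) - 1.0) * m * C**2

m = 1.0e-3  # one gram, in kg
print(f"rest energy of one gram: {rest_energy(m):.2e} J")

# At a hypothetical speed of 1% of c, compare with the classical (1/2) m v^2.
v = 0.01 * C
ratio = kinetic_energy(m, v) / (0.5 * m * v**2)
print(f"relativistic KE / classical KE at 0.01c: {ratio:.6f}")
```

The ratio is slightly above 1 at this speed. It moves closer to 1 as the speed decreases and grows rapidly as the speed approaches $c$.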
For completeness, we should note that it is still possible to relate relativistic energy and momentum,
just like it was possible to relate classical kinetic energy and momentum, though we will not derive
the expressions here:
\[ E^2 = p^2 c^2 + m^2 c^4 \tag{2.76} \]
\[ v = \frac{p c^2}{E} \tag{2.77} \]
Here $p$ is the momentum of a body, $m$ its mass, $v$ its velocity, $E$ its energy, and $c$ is the
speed of light. We can use this to write the relativistic kinetic energy and momentum
equations in a different form:
\[ KE = \sqrt{p^2 c^2 + m^2 c^4} - m c^2 \qquad\text{and}\qquad p = \sqrt{\frac{E^2}{c^2} - m^2 c^2} \tag{2.78} \]
The energy content of a body still scales with its momentum, and for a body at rest (p = 0), the
energy content is purely the rest energy mc2 . Once again we have an unexpected result, however:
objects with no mass must also have momentum, so long as they have energy. For massless particles
– such as the photons that make up a beam of light – we have the result E = pc, or p = E/c. This
is truly another odd result of relativity, completely unexpected from classical physics! How can
objects with no mass still have momentum? Since matter and energy are equivalent according to
relativity, having energy is just as good as having mass, and still leads to a net momentum. This
will become an important consideration when we begin to study optics and modern physics.
\[ p = \frac{E}{c} \tag{2.79} \]
If you combine Eqs. 2.77 and 2.79, you come to an even wilder conclusion. If the particle has
zero mass, but some energy greater than zero, then we can write
\[ v = \frac{p c^2}{E} = \frac{\left( E/c \right) c^2}{E} = c \tag{2.80} \]
A particle with zero mass always moves at the speed of light, and can never stop moving! It
doesn’t matter what the energy of the particle is, anything with finite energy but zero mass has to
travel at the speed of light. The converse is true as well – anything moving at the speed of light
must be massless. Just to drive the point home one last time: the speed of light is an upper limit
to physically attainable speeds for material bodies.
1. An astronaut traveling at v = 0.80c taps her foot 3.0 times per second. What is the frequency
of taps determined by an observer on earth? (Hint: be careful about the difference between time
and frequency!)
5.0 taps/sec
6.7 taps/sec
1.8 taps/sec
3.0 taps/sec
2. A spaceship moves away from earth at high speed. How do experimenters on earth measure
a clock in the spaceship to be running? How do those in the spaceship measure a clock on earth
to be running?
slow; fast
slow; slow
fast; slow
fast; fast
3. If you are moving in a spaceship at high speed relative to the earth, would you notice a
difference in your pulse rate? In the pulse rate of the people back on earth?
no; yes
no; no
yes; no
yes; yes
4. The period of a pendulum is measured to be 3.00 sec in its own reference frame. What is the
period as measured by an observer moving at a speed of 0.950c with respect to the pendulum?
6.00 sec
13.4 sec
0.938 sec
9.61 sec
5. The Stanford Linear Accelerator (SLAC) can accelerate electrons to velocities very close to
the speed of light (up to about 0.99999999995c or so). If an electron travels the 3 km length of
the accelerator at v = 0.999c, how long is the accelerator from the electron’s reference frame?
134 m
67.1 km
94.9 m
300 m
6. A spacecraft with the shape of a sphere of diameter D moves past an observer on Earth with
a speed 0.5 c. What shape does the observer measure for the spacecraft as it moves past?
streak
ellipsoid
sphere
cube
7. Suppose you’re an astronaut being paid according to the time you spend traveling in space.
You take a long voyage traveling at a speed near that of light. Upon your return to earth,
you’re asked how you would like to be paid: according to the time elapsed by a clock on earth,
or according to the ship’s clock. Which do you choose to maximize your paycheck?
The earth clock.
The ship’s clock.
It doesn’t matter.
2.7 Problems
1. In the 1996 movie Eraser,[8] a corrupt business, Cyrez, is manufacturing a handheld rail gun
which fires aluminum bullets at nearly the speed of light. Let us be optimistic and assume the
actual velocity is $0.75c$. We will also assume that the bullets are tiny, about the mass of a paper
clip, or $m = 5 \times 10^{-4}$ kg.
2. Show that the kinetic energy of a (non-relativistic) particle can be written as $KE = p^2/2m$,
where $p$ is the momentum of a particle of mass $m$.
3. A pion at rest ($m_\pi = 273\,m_{e^-}$) decays to a muon ($m_\mu = 207\,m_{e^-}$) and an antineutrino
($m_{\bar{\nu}} \approx 0$). This reaction is written as $\pi^- \rightarrow \mu^- + \bar{\nu}$. Find the kinetic energy of the muon and
the energy of the antineutrino in electron volts. Hint: relativistic momentum is conserved.
5. The average lifetime of a pi ($\pi$) meson in its own frame of reference (i.e., the proper lifetime)
is $2.6 \times 10^{-8}$ s.
(a) If the meson moves at v = 0.98c, what is its mean lifetime as measured by an observer on
earth?
(b) What is the average distance it travels before decay, measured by an observer on Earth?
(c) What distance would it travel if time dilation did not occur?
6. You are packing for a trip to another star, and on your journey you will travel at 0.99c. Can
you sleep in a smaller cabin than usual, because you will be shorter when you lie down? Explain
your answer.
7. A deep-space probe moves away from Earth with a speed of 0.88c. An antenna on the probe
requires 4.0 s, in probe time, to rotate through 1.0 rev. How much time is required for 1.0 rev
according to an observer on Earth?
8. A friend in a spaceship travels past you at a high speed. He tells you that his ship is 24 m
long and that the identical ship you are sitting in is 18 m long.
9. A Klingon space ship moves away from Earth at a speed of 0.700c. The starship Enterprise
pursues at a speed of 0.900c relative to Earth. Observers on Earth see the Enterprise overtaking
the Klingon ship at a relative speed of 0.200c. With what speed is the Enterprise overtaking
the Klingon ship as seen by the crew of the Enterprise?
10. An observer sees two particles traveling in opposite directions, each with a speed of 0.99000c.
What is the speed of one particle with respect to the other?
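1. 1.8 taps/sec. The ‘proper time’ $\Delta t_p$ is that measured by the astronaut herself, which is 1/3
of a second between taps (so that there are 3 taps per second). The time interval between taps
measured on earth is dilated (longer), so there are fewer taps per second. For the astronaut:
\[ \Delta t_p = \frac{1\,\text{s}}{3\,\text{taps}} \]
For the observer on earth:
\[ \Delta t' = \gamma \Delta t_p = \frac{1}{\sqrt{1 - \frac{0.8^2 c^2}{c^2}}} \cdot \frac{1\,\text{s}}{3\,\text{taps}} = \frac{1}{\sqrt{1 - 0.8^2}} \cdot \frac{1\,\text{s}}{3\,\text{taps}} \approx \frac{0.56\,\text{s}}{\text{tap}} = \frac{1\,\text{s}}{1.8\,\text{taps}} \]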
2. slow; slow. The time-dilation effect is symmetric, so observers in each frame measure a
clock in the other to be running slow. Put another way, the relative velocity of the earth and
the ship is the same no matter who you ask – each says the other is moving with some speed v,
and they are sitting still. Therefore, the dilation effect is the same in both cases.
3. no; yes. There is no relative speed between you and your own pulse, since you are in the
same reference frame, so there is no difference in your pulse rate (possible space-travel-related
anxieties aside). There is a relative velocity between you and the people back on earth, however,
so you would find their pulse rate slower than normal. Similarly, they would find your pulse
rate slower than normal, since you are moving relative to them. Relativistic effects are always
attributed to the other party – you are always at rest in your own reference frame.
4. 9.61 sec. The proper time is that measured in the reference frame of the pendulum itself,
$\Delta t_p = 3.00$ sec. The moving observer has to observe a longer period for the pendulum, since from
the observer’s point of view, the pendulum is moving relative to it. Observers always perceive
clocks moving relative to them as running slow. The factor between the two times is just $\gamma$:
\[ \Delta t' = \gamma \Delta t_p = \frac{3.00\,\text{sec}}{\sqrt{1 - 0.950^2}} \approx 9.61\,\text{sec} \]
5. 134 m. The electron in its own reference frame sees the accelerator moving toward it at
$0.999c$, and sees a contracted length:
\[ L = \frac{L_p}{\gamma} = 3\,\text{km} \cdot \sqrt{1 - \frac{0.999^2 c^2}{c^2}} = 3\,\text{km} \cdot \sqrt{1 - 0.999^2} = 0.134\,\text{km} = 134\,\text{m} \]
6. ellipsoid. The sphere is length contracted only along its direction of motion, i.e., only along
one axis. Squishing a sphere along one axis makes an ellipsoid.
7. The earth’s clock. Less time will have passed in your reference frame, since you are moving
relative to the earth. The earth’s clock will have registered more time elapsed than yours.
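The numerical answers to questions 1, 4, and 5 all come from a single factor of $\gamma$, which makes them easy to verify. In this sketch speeds are given as $\beta = v/c$, and the variable names are illustrative.

```python
import math

def gamma(beta):
    """Lorentz factor for a speed beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta**2)

# Question 1: the interval between taps is dilated, so the frequency drops by gamma.
taps_per_second_on_earth = 3.0 / gamma(0.80)

# Question 4: the pendulum period is dilated by gamma.
period_seen_by_observer = 3.00 * gamma(0.950)

# Question 5: the 3 km accelerator is contracted by gamma in the electron's frame.
length_seen_by_electron = 3000.0 / gamma(0.999)

print(f"{taps_per_second_on_earth:.2f} taps/sec")
print(f"{period_seen_by_observer:.2f} sec")
print(f"{length_seen_by_electron:.0f} m")
```

Note that time intervals are multiplied by $\gamma$, while rates (question 1) and lengths along the motion (question 5) are divided by it.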
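1. $2.3 \times 10^{13}$ J, $2.56 \times 10^{-4}$ kg. First part: relativistic kinetic energy is given by:
\[ KE = \left( \gamma - 1 \right) m c^2 \]
\[ \gamma = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}} = \frac{1}{\sqrt{1 - \frac{(0.75c)^2}{c^2}}} = \frac{1}{\sqrt{1 - 0.75^2}} = 1.51 \]
\[ m c^2 = \left( 5 \times 10^{-4}\,\text{kg} \right) \left( 3 \times 10^{8}\,\text{m/s} \right)^2 = 4.5 \times 10^{13}\,\text{kg}\cdot\text{m}^2/\text{s}^2 = 4.5 \times 10^{13}\,\text{J} \]
\[ KE = \left( \gamma - 1 \right) m c^2 = 0.51 \cdot 4.5 \times 10^{13}\,\text{J} \approx 2.3 \times 10^{13}\,\text{J} \]
Second part: what rest mass is equivalent to this amount of kinetic energy? We just need to
use the mass-energy equivalence formula. Writing $m_{eq}$ for the equivalent rest mass, to keep it
distinct from the bullet’s mass $m$:
\begin{align}
E_R &= m_{eq} c^2 = KE\\
\implies\quad m_{eq} &= \frac{KE}{c^2} = \frac{\left( \gamma - 1 \right) m c^2}{c^2}\\
&= \left( \gamma - 1 \right) m = 0.51\,m\\
&= 2.56 \times 10^{-4}\,\text{kg}
\end{align}
In other words, it takes fully half the mass of the bullet itself, completely converted to pure
energy, to fire one round. Using more conventional propellants, that would mean roughly 5500 tons
(about 5.5 kilotons) of TNT per round, taking one ton of TNT as $4.184 \times 10^{9}$ J.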
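The numbers in this solution can be reproduced in a few lines. The sketch uses the rounded value $c = 3 \times 10^8$ m/s, as in the solution. The TNT conversion factor of $4.184 \times 10^9$ J per ton is an assumption brought in from the usual convention, not a value given in the text, and the variable names are illustrative.

```python
import math

C = 3.0e8                # speed of light in m/s (rounded, as in the solution)
TNT_J_PER_TON = 4.184e9  # assumption: conventional energy of one ton of TNT

m_bullet = 5.0e-4        # bullet mass in kg
beta = 0.75              # bullet speed as a fraction of c

g = 1.0 / math.sqrt(1.0 - beta**2)
kinetic_energy = (g - 1.0) * m_bullet * C**2   # joules
equivalent_mass = kinetic_energy / C**2        # kg of mass converted to energy
tnt_tons = kinetic_energy / TNT_J_PER_TON      # tons of TNT with the same energy

print(f"gamma           = {g:.3f}")
print(f"kinetic energy  = {kinetic_energy:.2e} J")
print(f"equivalent mass = {equivalent_mass:.2e} kg")
print(f"TNT equivalent  = {tnt_tons:.0f} tons")
```

With the stated conversion factor, the TNT equivalent comes out in the thousands of tons, which gives a sense of how impractical such a weapon would be.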
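2.
\[ KE = \frac{1}{2} m v^2 = \frac{m v \cdot v}{2} = \frac{m v \cdot v}{2} \cdot \frac{m}{m} = \frac{m v \cdot m v}{2m} = \frac{p \cdot p}{2m} = \frac{p^2}{2m} \]
Or, since you know the answer you want ...
\[ \frac{p^2}{2m} = \frac{\left( m v \right)^2}{2m} = \frac{m^2 v^2}{2m} = \frac{m v^2}{2} = \frac{1}{2} m v^2 \]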
3. 4.08 MeV for the muon, 29.6 MeV for the antineutrino. This one is a bit lengthier
than most of the others! Before the decay, we have only the pion, and since it is at
rest, it has zero momentum and zero kinetic energy. After it decays, we have a muon and an
antineutrino, which are created and speed off in opposite directions (to conserve momentum). Both total
energy, including rest energy, and momentum must be conserved before and after the decay.
First, conservation of momentum. Before the decay, since the pion is at rest, we have zero
momentum. Therefore, afterward, the muon and antineutrino must have equal and opposite
momenta. This means we can essentially treat this as a one-dimensional problem, and not
bother with vectors. A consolation prize of sorts.
\[ p_\mu + p_\nu = 0 \quad\implies\quad p_\nu = -p_\mu = -\gamma_\mu m_\mu v_\mu \]
For the last step, we made use of the fact that relativistic momentum is $p = \gamma m v$. Now we can
also write down conservation of energy. Before the decay, we have only the rest energy of the
pion. Afterward, we have the energy of both the muon and antineutrino. The muon has both
kinetic energy and rest energy, and we can write its total energy in terms of $\gamma$ and its
rest mass, $E = \gamma m c^2$. The antineutrino has negligible mass, and therefore no rest energy, but
we can still assign it a total energy based on its momentum, $E = pc$.
\[ m_\pi c^2 = \gamma_\mu m_\mu c^2 + p_\nu c \]
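Now we can combine these two conservation results and try to solve for the velocity of the muon:
\begin{align}
m_\pi &= \gamma_\mu m_\mu + \frac{p_\nu}{c}\\
m_\pi &= \gamma_\mu m_\mu - \gamma_\mu m_\mu \frac{v_\mu}{c}\\
\frac{m_\pi}{m_\mu} &= \gamma_\mu - \gamma_\mu \frac{v_\mu}{c}\\
\frac{m_\pi}{m_\mu} &= \gamma_\mu \left[ 1 - \frac{v_\mu}{c} \right]
\end{align}
We will need to massage this quite a bit more to solve for $v_\mu$ ...
\begin{align}
\frac{m_\pi}{m_\mu} &= \gamma_\mu \left[ 1 - \frac{v_\mu}{c} \right] = \frac{1 - \frac{v_\mu}{c}}{\sqrt{1 - \frac{v_\mu^2}{c^2}}}\\
\left( \frac{m_\pi}{m_\mu} \right)^2 &= \frac{\left( 1 - \frac{v_\mu}{c} \right)^2}{1 - \frac{v_\mu^2}{c^2}}\\
&= \frac{\left( 1 - \frac{v_\mu}{c} \right)^2}{\left( 1 - \frac{v_\mu}{c} \right) \left( 1 + \frac{v_\mu}{c} \right)}\\
&= \frac{1 - \frac{v_\mu}{c}}{1 + \frac{v_\mu}{c}}
\end{align}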
Now we’re getting somewhere. Take what we have left, and solve it for $v_\mu$ ... we will leave that
as an exercise to the reader, and quote only the result, using the given masses of the pion and
muon:
\[ \frac{v_\mu}{c} = \frac{1 - \left( \frac{m_\pi}{m_\mu} \right)^2}{1 + \left( \frac{m_\pi}{m_\mu} \right)^2} \approx -0.270 \]
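From here, we are home free. We can calculate $\gamma_\mu$ and the muon’s kinetic energy first. It is
convenient to remember that the electron mass is 511 keV/$c^2$.
\[ \gamma_\mu = \frac{1}{\sqrt{1 - \frac{v_\mu^2}{c^2}}} = \frac{1}{\sqrt{1 - \frac{(0.27c)^2}{c^2}}} = \frac{1}{\sqrt{1 - 0.27^2}} \approx 1.0386 \]
\[ KE_\mu = \left( \gamma_\mu - 1 \right) m_\mu c^2 = 0.0386 \cdot 207 \cdot 511\,\text{keV} \approx 4.08\,\text{MeV} \]
\begin{align}
E_\nu = p_\nu c &= -p_\mu c\\
&= -\gamma_\mu m_\mu v_\mu c\\
&= -1.0386 \cdot 207 \cdot 511\,\text{keV}/c^2 \cdot \left( -0.270c \right) \cdot c
\end{align}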
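As a cross-check on the algebra, the sketch below evaluates the muon velocity, its kinetic energy, and the antineutrino energy directly from the mass ratio, and confirms that energy is conserved. It takes the electron rest energy as 0.511 MeV, works in MeV throughout, and uses illustrative variable names.

```python
import math

ME_C2_MEV = 0.511            # electron rest energy in MeV

m_pi = 273 * ME_C2_MEV       # pion rest energy in MeV
m_mu = 207 * ME_C2_MEV       # muon rest energy in MeV

# Muon velocity as a fraction of c, from the quoted result.
ratio_sq = (m_pi / m_mu) ** 2
beta_mu = (1.0 - ratio_sq) / (1.0 + ratio_sq)

gamma_mu = 1.0 / math.sqrt(1.0 - beta_mu**2)
ke_mu = (gamma_mu - 1.0) * m_mu        # muon kinetic energy in MeV
e_nu = -gamma_mu * m_mu * beta_mu      # antineutrino energy, E = p c, in MeV

print(f"v_mu / c = {beta_mu:.3f}")
print(f"gamma_mu = {gamma_mu:.4f}")
print(f"KE_mu    = {ke_mu:.2f} MeV")
print(f"E_nu     = {e_nu:.1f} MeV")

# Energy conservation: the pion rest energy equals the muon total energy
# plus the antineutrino energy.
print(f"check    = {gamma_mu * m_mu + e_nu:.3f} MeV vs {m_pi:.3f} MeV")
```

The last line compares the two sides of the energy balance. They agree to within rounding, since the velocity formula was derived from exactly that balance.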
4. $1.96 \times 10^{13}$ m. The 15 h set on the alarm clock in the spaceship is the proper time interval,
$\Delta t_p$. Since the space ship is moving away from the earth at $v = 0.77c$, an earthbound observer
observes a longer dilated time interval, $\Delta t'$. Based on this longer time interval, the earthbound
observer will measure that the space ship has covered a distance of $v \Delta t'$. So, first we need to
calculate $\gamma$, then the dilated time interval, then finally the distance measured by the earthbound
observer.
\[ \gamma = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}} = \frac{1}{\sqrt{1 - \frac{(0.77c)^2}{c^2}}} = \frac{1}{\sqrt{1 - 0.77^2}} = 1.57 \]
\begin{align}
\Delta t' &= \gamma \Delta t_p\\
&= 1.57 \cdot 15\,\text{h} = 1.57 \cdot 5.4 \times 10^{4}\,\text{s} \approx 8.48 \times 10^{4}\,\text{s}\\
d' &= v \Delta t'\\
&= 0.77c \cdot 8.48 \times 10^{4}\,\text{s} = 0.77 \cdot 3 \times 10^{8}\,\text{m/s} \cdot 8.48 \times 10^{4}\,\text{s}\\
d' &\approx 1.96 \times 10^{13}\,\text{m}
\end{align}
5. $1.31 \times 10^{-7}$ s, 38.4 m, 7.64 m. The $\pi$ meson’s lifetime in its own frame is the proper time
interval, $\Delta t_p = 2.6 \times 10^{-8}$ s. An earthbound observer measures a longer dilated time interval $\Delta t'$.
To calculate it, we need only calculate $\gamma$ for the velocity given, $v_\pi = 0.98c$.
\[ \gamma = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}} = \frac{1}{\sqrt{1 - \frac{(0.98c)^2}{c^2}}} = \frac{1}{\sqrt{1 - 0.98^2}} = 5.03 \]
\begin{align}
\Delta t' &= \gamma \Delta t_p\\
&= 5.03 \cdot 2.6 \times 10^{-8}\,\text{s}\\
&\approx 1.31 \times 10^{-7}\,\text{s}
\end{align}
The distance the $\pi$ meson travels in the earthbound observer’s reference frame, $d'$, is the $\pi$
meson’s velocity multiplied by the time interval measured by the earthbound observer. We
don’t need to worry about whether the velocity is measured in the $\pi$ meson’s or the observer’s
frame: since it is a relative velocity, it is the same either way.
\[ d' = \gamma v_\pi \Delta t_p = v_\pi \Delta t' = (0.98c) \cdot 1.31 \times 10^{-7}\,\text{s} = 0.98 \cdot 3 \times 10^{8}\,\text{m/s} \cdot 1.31 \times 10^{-7}\,\text{s} \approx 38.4\,\text{m} \]
Without time dilation, the distance traveled would just be the proper lifetime multiplied by the
meson’s velocity:
\[ d = v_\pi \Delta t_p = (0.98c) \cdot 2.6 \times 10^{-8}\,\text{s} = 0.98 \cdot 3 \times 10^{8}\,\text{m/s} \cdot 2.6 \times 10^{-8}\,\text{s} \approx 7.64\,\text{m} \]
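The three results for the $\pi$ meson follow from one value of $\gamma$, as this short sketch shows. It uses the rounded $c = 3 \times 10^8$ m/s from the solution, and the variable names are illustrative.

```python
import math

C = 3.0e8                  # speed of light in m/s (rounded, as in the solution)

proper_lifetime = 2.6e-8   # seconds, in the meson's own frame
beta = 0.98                # meson speed as a fraction of c

g = 1.0 / math.sqrt(1.0 - beta**2)
dilated_lifetime = g * proper_lifetime                  # seen from earth
distance_with_dilation = beta * C * dilated_lifetime    # metres, earth frame
distance_without_dilation = beta * C * proper_lifetime  # metres, if no dilation

print(f"gamma            = {g:.2f}")
print(f"dilated lifetime = {dilated_lifetime:.2e} s")
print(f"distance (with)  = {distance_with_dilation:.1f} m")
print(f"distance (w/o)   = {distance_without_dilation:.2f} m")
```

The two distances differ by exactly the factor $\gamma$, which is why fast-moving mesons are observed to travel much farther than their proper lifetime alone would suggest.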
6. No. There is no relative speed between you and your cabin, since you are in the same
reference frame. You and your bed will remain at the same lengths relative to each other.
7. 8.42 s. The time interval in the probe’s reference frame is the proper one, $\Delta t_p$ ... which makes
sense, since the antenna is part of the probe itself! The probe and antenna are moving relative
to the earth, and therefore the earthbound observer measures a longer, dilated time interval $\Delta t'$:
\begin{align}
\Delta t_{probe} &= \Delta t_p\\
\Delta t_{earth} &= \Delta t'\\
\Delta t' &= \gamma \Delta t_p
\end{align}
As usual, we first need to calculate $\gamma$. No problem, given the probe’s velocity of $0.88c$ relative
to earth:
\[ \gamma = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}} = \frac{1}{\sqrt{1 - \frac{(0.88c)^2}{c^2}}} = \frac{1}{\sqrt{1 - 0.88^2}} = 2.11 \]
The proper time interval for one revolution $\Delta t_p$ in the probe’s reference frame is 4.0 s, so we can
readily calculate the time interval observed by the earthbound observer:
\[ \Delta t' = \gamma \Delta t_p = \frac{4.0\,\text{s}}{\sqrt{1 - 0.88^2}} \approx 8.42\,\text{s} \]
8. 24 m; 18 m; 0.661c. Once again: if you are observing something in your own reference
frame, there is no length contraction or time dilation. You always observe your own ship to be
the same length. If your friend’s ship is 24 m long, and yours is identical, you will measure it to
be 24 m.
On the other hand, you are moving relative to his ship, so you would observe his ship to be
length contracted, and measure a shorter length. Your friend, on the other hand, will observe
exactly the same thing: he will see your ship contracted, by precisely the same amount. Your
observation of his ship has to be the same as his observation of your ship. Since you are the only
two observers, and you both have the same relative velocity, you must observe the same length
contraction. If he sees your ship as 18 m long, then you would also see his (identical) ship as
18 m long.
Given the relationship between the contracted and proper length, we can find the relative velocity
easily. Your measurement of your own ship is the proper length $L_p$, while your measurement of
your friend’s ship is the contracted length $L'$:
\begin{align}
L_p &= \gamma L'\\
\implies\quad \gamma &= \frac{L_p}{L'} = \frac{24}{18} = \frac{4}{3}\\
\frac{1}{\sqrt{1 - \frac{v^2}{c^2}}} &= \frac{4}{3}\\
1 - \frac{v^2}{c^2} &= \frac{3^2}{4^2} = \frac{9}{16}\\
\frac{v^2}{c^2} &= 1 - \frac{9}{16} = \frac{7}{16}\\
v &= \sqrt{\frac{7}{16}}\,c = \frac{\sqrt{7}}{4}\,c \approx 0.661c
\end{align}
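The same steps can be packaged as a small function that turns a measured contraction into a relative speed. This is only a sketch, and the function name is an illustrative choice.

```python
import math

def speed_from_lengths(contracted_length, proper_length):
    """Relative speed, as a fraction of c, that contracts proper_length
    down to contracted_length along the direction of motion."""
    inverse_gamma = contracted_length / proper_length   # equals 1 / gamma
    return math.sqrt(1.0 - inverse_gamma**2)

beta = speed_from_lengths(18.0, 24.0)
print(f"v = {beta:.3f} c")

# Sanity check: no contraction means no relative motion.
print(speed_from_lengths(24.0, 24.0))
```

For the 18 m and 24 m lengths in this problem, the function reproduces the value $\sqrt{7}/4 \approx 0.661$ found above.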
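9. 0.541c. This is just a problem of relativistically adding velocities, if we can keep them all
straight. Let the unprimed system denote velocities measured relative to the earth, and the
primed system those measured relative to the Enterprise. We have, then:
\begin{align}
v_e &= 0.900c \quad \text{(Enterprise, relative to the earth)}\\
v_k &= 0.700c \quad \text{(Klingon ship, relative to the earth)}\\
v_k' &= \,? \quad \text{(Klingon ship, relative to the Enterprise)}
\end{align}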
Since the Enterprise is moving faster relative to the earth than the Klingon ship, that means
that from the Enterprise’s point of view, the Klingons are actually moving backwards toward
them. If we plug what we know into the velocity addition formula ...
\[ v_k = \frac{v_e + v_k'}{1 + \frac{v_e v_k'}{c^2}} \]
It takes a bit of algebra, but we can readily solve this for $v_k'$. Strictly, the algebra gives
$v_k' = (v_k - v_e)/(1 - v_e v_k/c^2)$, which is negative because the Klingon ship is approaching the
Enterprise in the Enterprise’s frame. We quote its magnitude, the closing speed:
\[ v_k' = \frac{v_e - v_k}{1 - \frac{v_e v_k}{c^2}} \]
Not so surprisingly, what we have just done is to re-write the ‘velocity addition formula’ as a
‘velocity subtraction formula.’ It is just a rearrangement of the same formula (you can verify that
the two equations above are equivalent, up to the overall sign), but the second form is far more
convenient for our present purposes.
We can find the velocity of the Klingon ship relative to the Enterprise in terms of both ships’
velocities relative to the earth. In the limit that both velocities are much smaller than $c$, we see
that $v_k' \approx v_e - v_k = 0.200c$, just as we would expect from normal Newtonian physics. Since in
this case, neither velocity is negligible compared to $c$, the actual $v_k'$ will be significantly larger.
At this point, we can just plug in the numbers we have and see:
\begin{align}
v_k' &= \frac{v_e - v_k}{1 - \frac{v_e v_k}{c^2}}\\
&= \frac{0.900c - 0.700c}{1 - \frac{(0.900c)(0.700c)}{c^2}}\\
&= \frac{0.200c}{1 - (0.900)(0.700)} = \frac{0.200c}{0.37}\\
v_k' &\approx 0.541c
\end{align}
So, as far as the crew of the Enterprise is concerned, they are overtaking the Klingon ship at a
rate of 0.541c.
10. 0.99995c. Let the observer be in frame $O'$. In the reference frame of one of the particles,
labeled $O$, the observer is traveling at $v = 0.99c$, and the second particle is traveling at $v_2' = 0.99c$
relative to the observer. We can then find the velocity of the second particle relative to the first,
$v_2$, through velocity addition:
\begin{align}
v_2 &= \frac{v + v_2'}{1 + \frac{v v_2'}{c^2}} \tag{2.81}\\
&= \frac{0.99c + 0.99c}{1 + \frac{(0.99c)(0.99c)}{c^2}} \tag{2.82}\\
&= \frac{1.98c}{1 + 0.9801} \approx 0.99995c \tag{2.83}
\end{align}
This is an example of a problem where you need to make sure to use enough significant digits!
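The remark about significant digits can be illustrated with a short Python sketch. The helper name `add_velocities` is hypothetical, and speeds are again expressed as fractions of c.

```python
def add_velocities(v, u_prime):
    """Relativistic velocity addition, with speeds given as fractions of c."""
    return (v + u_prime) / (1.0 + v * u_prime)


v2 = add_velocities(0.99, 0.99)
print(f"{v2:.2f}c")  # two decimals: prints 1.00c, which looks like c itself
print(f"{v2:.5f}c")  # five decimals: prints 0.99995c, clearly below c
```

With too few digits the answer rounds to c, hiding the fact that the relative speed stays strictly below the speed of light.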
3 Electric Forces and Fields
Electricity has become ubiquitous in modern life, so much so that we rarely think about life without it. Though ancient Greeks first began experimenting with electricity around 700 B.C., it was not until the 18th and 19th centuries that we began to clearly understand electricity and how to harness it.
In this chapter, we will discuss electric charges and the electric force, quantified through Coulomb’s law, and introduce the electric field associated with charges. With these concepts, we will be able to explain many of the myriad electrostatic phenomena around us.
3.1 Properties of Electric Charges
Normal objects usually contain equal amounts of positive and negative charges – they are electrically neutral. Electric forces arise only when there is an imbalance in electric charge, and objects carry a net positive or negative charge. On the atomic scale, the carriers of positive charge are the protons. Along with neutrons, which have no electric charge, they comprise the nucleus of an atom (which is about $10^{-15}$ m across). Electrons are the carriers of negative charge. In a gram of normal matter, there are about $10^{23}$ protons and an equal number of electrons, so the net charge is zero.
Electrons are far lighter than protons, and are more easily accelerated by forces. In addition, they occupy the outer regions of atoms, and are more easily gained or lost. Objects that become charged do so by gaining or losing electrons, not protons. Table 3.1 gives some properties of protons, electrons, and neutrons.
Table 3.1: Properties of electrons, protons, and neutrons
Particle          Charge [C]               Charge [e]   Mass [kg]
electron ($e^-$)  $-1.60\times10^{-19}$    $-1$         $9.11\times10^{-31}$
proton ($p^+$)    $+1.60\times10^{-19}$    $+1$         $1.67\times10^{-27}$
neutron ($n^0$)   $0$                      $0$          $1.67\times10^{-27}$
Charge can be transferred from one material to another. Many chemical reactions are, in essence, charge transfer from one species to another (see page 90 for some examples). Rubbing two materials together facilitates this process by increasing the area of contact between the materials – e.g., rubbing a balloon on your hair. Since it is a gain or loss of electrons that gives a net charge, this means that when objects become charged, negative charge is transferred from one object to another.
Units of charge:
The SI unit of charge is the Coulomb, [C]. One unit of charge is $e = 1.6\times10^{-19}$ C.
Charge is never created or destroyed, only transferred from one object to another. Objects become charged by gaining or losing electrons, transferring them to other objects. Charge is also quantized, meaning it only comes in multiples of the fundamental unit of charge $e$. An object can have a charge of $\pm e$, $\pm 2e$, $\pm 3e$, etc., but not $+0.27e$ or $-0.71e$.[i] Electrons have a negative charge of one unit ($-e$), and protons have a positive charge of one unit ($+e$). The SI unit of charge is the coulomb [C], and $e$ has the value $1.6\times10^{-19}$ C. Since $e$ is so tiny when measured in Coulombs, and since it is the basic fundamental unit of charge, we will sometimes simply measure a small amount of charge in “e’s” – how many individual unit charges are present.
[i] Quarks are an exception we will cover at the end of the semester.
How do materials respond to becoming charged, and how do we charge up a material in the first
place? What do we mean by “becoming charged” anyway? This will be more clear shortly, but for
now, we will presume that “charging” simply means creating an imbalance of electric charges in a
material. A net negative charge can be achieved by adding excess electrons to a material, and a
net positive charge can be created by taking away some electrons from a material.
For our purposes, materials respond to becoming charged in one of two ways: the excess charge can move about freely and distribute itself evenly, or the excess charge can stay localized to the region where it was created. Conductors and insulators are the two broad classes of materials, respectively, which fit these criteria. In conductors, excess charges move freely in response to an electric force. All other materials are insulators, and the charges do not move!
In fact, there is nothing particularly special about the excess charge. The excess charge will move in the material in the same way any other charges do; we can’t tell the charges apart. In other words, conductors are materials in general where charges move freely, and insulators are materials in general where they do not. There does not need to be excess charge for this to be true: charges inside conductors are still in motion even if, overall, they cancel each other out.
Conductors:
1. e.g., metals – silver, gold, aluminum, steel, etc.
2. charges are mobile, and move in response to an electric force
3. large number of charges
4. charge distributes evenly over surface
Insulators:
1. e.g., glass, most ceramics, rubbers, and plastics
2. charges are immobile
3. charge deposited on insulators stays localized
Semiconductors:
1. e.g., silicon, gallium arsenide, germanium
2. in between conductors and insulators
3. charges are highly mobile ...
4. ... but the number of charges is small, depends e.g., on temperature and purity
5. conducting properties can be widely varied
Copper and aluminum are typical conductors. When conductors are charged in some small region, the charge readily distributes itself over the entire surface of the material. Thus, on a conductor, excess charge always spreads out over the entire surface rather than staying where it was deposited (evenly so for a symmetric shape such as a sphere). Charge flows through a conductor readily, and if given a chance, out of it. This is an electric current, as we will see shortly.
Glass and rubber are typical insulators. When insulators are charged (e.g., by rubbing),
only the rubbed areas are charged. There is no tendency for the charge to flow to other
regions of the material - charge deposited on insulators will stay localized to a small region.
Conduction is charging through physical contact, which moves e− from one object to another.
One example is charging a balloon by rubbing it on your hair. After doing this, the balloon easily
sticks to a wall or picks up little bits of paper, and your hair stands a bit on end. What you
have really done is transferred charges from the balloon to your hair, or vice versa. Each of your
individual hairs becomes charged the same way (either all positive or all negative, depending on
what you rubbed on your hair), and the individual strands repel each other. Their repulsion makes
them want to maximize the distance between them, which is achieved by standing on end, radiating
outward.
As another example, consider rubbing an insulating rod (e.g., rubber, hard plastic, glass) against a piece of silk. The act of rubbing these two insulating materials will physically force some charges to move from one object to the other. When charges are transferred to the insulating rod, they do not move – regions of localized charge are created in the rubbed regions. No charge has been created or destroyed; we simply moved some charges from one place to another. One object ends up with a net positive charge, the other with a net negative charge, equal in magnitude. One can verify that both objects are charged by trying to pick up bits of paper with them. This is also true when you rub a balloon on your hair: it is clear immediately that both the balloon and your hair have become charged! It couldn’t be any other way, or we would have had to create charges out of thin air.
Figure 3.2 and its accompanying box illustrate the process of charging a metallic object by conduction. In this example, you take a rubber rod you have already charged (say, with a piece of silk or your hair), and use that to charge a third object.
3.2.1.1 Grounding
Sounds simple enough. Why can’t we just take a piece of copper pipe and rub it with a cloth? You can, if you are careful . . . charges flow evenly through a conductor, and if possible, out of the conductor entirely. Only isolated conductors can be charged; electrically connected conductors cannot. By ‘isolated,’ we mean the conductor we are trying to charge cannot have any sort of conducting path to the earth. The Earth can be considered an (essentially) infinite reservoir for electrons, either sourcing or sinking as many charges as we need. Since charges distribute themselves evenly over a conducting surface, if there were a path to the earth, the mobile charges would follow it to the earth, and keep doing so until none were left on the conductor.
Given a conducting path to the earth, charges from the conductor will always keep flowing. If charges can find a way to Earth, they will get there (e.g., through pipes, wires, or you!). Another phrase for when you are the connection to Earth is “ground fault.” This is when you accidentally make yourself the connection between a charged (or current-carrying) wire and the Earth ...
with potentially disastrous results. A so-called “ground fault interrupter” (GFI) senses when this
Can we charge an object without contacting it at all? Yes! This is induction charging. Now we explicitly need a ground point or reference point for this to work, though. An object connected to a conducting wire or pipe buried in the Earth is said to be grounded; the Earth itself is the ground point. As mentioned above, the Earth can be considered an infinite reservoir for electrons, sourcing or sinking an infinite number of charges. Using this idea, we can understand a non-contact charging process known as induction.
Figure 3.3 illustrates the process of charging a metallic object by induction. Charging an object by induction requires no contact with the object inducing the charge. First, we take an isolated conducting (metal) sphere. From our discussion above, it is crucial that it not be connected to the ground in any way. Placing it on an insulating stand will do nicely. Next, we bring a negatively charged rod near, but not touching, the sphere. We can prepare a negatively charged rubber rod, like the one in Fig. 3.3, by rubbing it with fur (charging the rubber by conduction).
When the charged rod is near the conducting sphere, the negative charges on the rod will repel the free negative charges (electrons) on the sphere, with the result that the half of the sphere nearest the rod will have a net positive charge and the far half a net negative charge (Fig. 3.3b). Now, if we take a conducting wire and connect the far end of the sphere to the ground (Fig. 3.3c), the excess negative charge on that side, repelled by the rod, will want to flow down the wire into the earth, effectively draining away a quantity of negative charge from the sphere. Once we have done that, the sphere now has a net positive charge.
Removing the ground connection, Fig. 3.3d, will instantaneously leave the side of the sphere near the rod positively charged, and the far side (nearly) uncharged, since we just drained away the negative charges. Once the rod is also removed (Fig. 3.3e), the conducting sphere reaches equilibrium after a very short time, and we must have a uniform distribution of charge on the surface of the conductor. Thus, the excess positive charge has to be evenly distributed on the surface of the sphere. We are left with a charged conducting sphere!
Figure 3.3: Charging a metallic object by induction. a) A neutral metallic sphere with equal numbers of positive and negative charges. b) The charge on a neutral metal sphere is redistributed when a charged rubber rod is brought near it. c) When the sphere is then grounded, some of the negative charges (electrons) leave it through the ground wire. d) When the ground connection is removed, excess positive charge is left on the sphere. e) When the charged rod is removed, the excess positive charge redistributes itself until the sphere’s surface is uniformly charged.
A process similar to charging by induction in conductors takes place in insulators (such as neutral
atoms or molecules in particular). The presence of a charged object can result in more positive
charge on one side of an insulating body than the other, by realignment of the charges within the
individual molecules. This process is known as polarization, and we will cover it in more depth
in the following chapter.
Our discussion of charging allows us to now better appreciate the distinction between conductors and insulators. The difference in the degree of conductivity between conductors and insulators is staggeringly enormous, a factor of $10^{20}$. For instance, a charged copper sphere connected to the ground loses its charge in a millionth of a second, while an otherwise identical glass sphere can hold its charge for years.
When you charge two objects, such as a balloon and your hair, you invariably end up observing an
attraction or repulsion between the charged objects. What is the character of this force? How does
it depend on how much the objects have been charged, how far away they are, or anything else?
If you continue to experiment with charged objects, you will find that the force due to electrically
charged objects has the following properties:
These properties led Coulomb (Fig. 3.1) to propose a neat mathematical form for the electric force between two charges:[iii]
Coulomb’s Law: the force between two charges $q_1$ and $q_2$, separated by a distance $r_{12}$, is given by:
\[ \vec{F} = k_e \frac{q_1 q_2}{r_{12}^2}\,\hat{r}_{12} \tag{3.1} \]
where $k_e$ is the “Coulomb constant,” and $\hat{r}_{12}$ is a unit vector pointing along a line connecting the two charges.
Equation 3.1 is known as “Coulomb’s law”. What Coulomb’s law states is that the force between two charged objects, $\vec{F}$, depends only on how big the charges are ($q_1$ and $q_2$), and how far apart they are ($r_{12}$). Keep in mind that force is a vector, and the dimensionless unit vector $\hat{r}_{12}$ reminds us that the electric force is directed along a line connecting the two charges $q_1$ and $q_2$. Figure 3.4 schematically shows the electric force between two like and two unlike charges. The distance between the charges $r_{12}$ is given in the SI unit of meters, [m], and the charges $q_1$ and $q_2$ are measured in the SI unit of charge, the Coulomb, [C]. The charges $q_1$ and $q_2$ can be either positive or negative, which makes the resulting force $\vec{F}$ repulsive when both charges have the same sign, and attractive when they are opposite – just like we expect. The Coulomb constant $k_e$ gives the relative strength of the electric force, just as $G$ gives the relative strength of the gravitational force, and has the SI value and units:
[iii] See Sect. 1 for a summary of units and notation conventions.
Coulomb’s constant
\[ k_e = 8.9875\times10^{9}\ \frac{\mathrm{N\cdot m^2}}{\mathrm{C^2}} \tag{3.2} \]
In most calculations, $k_e$ can be safely rounded to $\approx 9\times10^{9}$, which makes it a bit easier to remember. Also, $k_e$ is much, much larger than $G$,[iv] by about twenty orders of magnitude, meaning that if we treat Coulombs on equal footing with kilograms for a minute, the electric force is far, far stronger than the gravitational force. The electric force between a pair of 1 Coulomb charges is the same as the gravitational force between two masses of about $10^{10}$ kilograms at the same separation. Equivalently, one might say gravity is just exceptionally weak, so far as fundamental forces go.
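The comparison between $k_e$ and $G$ can be checked with a few lines of Python. This is only a sketch: the value of $G$ below is the standard one and is not quoted in the text above, and the variable names are our own.

```python
import math

K_E = 8.9875e9   # Coulomb constant, N m^2 / C^2 (Eq. 3.2)
G = 6.674e-11    # gravitational constant, N m^2 / kg^2 (assumed, not from the text)

# How many times larger k_e is than G, treating C and kg on equal footing.
ratio = K_E / G

# Setting G m^2 / r^2 equal to k_e (1 C)^2 / r^2 gives the mass m (in kg) whose
# gravitational attraction matches the electric force between 1 C charges.
equivalent_mass = math.sqrt(ratio)

print(f"k_e / G = {ratio:.1e}")
print(f"equivalent mass = {equivalent_mass:.1e} kg")
```

The ratio comes out a little above $10^{20}$, and the equivalent mass a little above $10^{10}$ kg, in line with the estimates in the paragraph.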
Question: Show that the units of Coulomb’s constant above yield a force in Newtons
when applied to Equation 3.1.
Note that no matter what the two charges are, Newton’s third law[v] still holds, viz., $\vec{F}_{21} = -\vec{F}_{12}$. The force on charge 1 due to charge 2 is equal and opposite the force on charge 2 due to charge 1, always. Even if one charge is a million times larger than the other, this must still be true. Mathematically, this is easy to see from Eq. 3.1 – the force between two charges depends on the product of the two charge values $q_1 q_2$, which means it is totally symmetric if we swap 1 for 2 or vice versa.
Figure 3.4: Electrical force between point charges. (a) Two particles $q_1$ and $q_2$ which both have positive charges. The force is repulsive, as it would be for two negative charges, and directed along the dashed line connecting the two charges. The unit vector $\hat{r}_{12}$ is indicated. (b) Two particles $q_1$ and $q_2$ with charges of opposite sign, separated by a distance $r_{12}$. The force is now attractive, as we expect.
When a number of separate charges act on a single charge, each exerts its own electric force.
These electric forces can all be computed separately, one at a time, and then added as vectors. This
is the powerful superposition principle, the same one you used with gravitation. This makes
calculating the net force from many charges a lot simpler than you might think. In fact, gravitation
and electrostatic forces have a number of similarities, with a few crucial differences, which we list
below.
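Returning to the superposition principle itself, the following Python sketch computes each Coulomb force separately and then adds them as vectors. The function names and the example charge layout are hypothetical, chosen only for illustration, and the charges are restricted to a plane for simplicity.

```python
import math

K_E = 8.9875e9  # Coulomb constant, N m^2 / C^2 (Eq. 3.2)


def coulomb_force(q_on, pos_on, q_by, pos_by):
    """Force (Fx, Fy) in newtons on charge q_on at pos_on due to q_by at pos_by."""
    dx = pos_on[0] - pos_by[0]
    dy = pos_on[1] - pos_by[1]
    r = math.hypot(dx, dy)
    magnitude = K_E * q_on * q_by / r**2  # positive means repulsive
    return (magnitude * dx / r, magnitude * dy / r)


def net_force(q_on, pos_on, others):
    """Vector sum of the forces from every (charge, position) pair in others."""
    fx = fy = 0.0
    for q, pos in others:
        f = coulomb_force(q_on, pos_on, q, pos)
        fx += f[0]
        fy += f[1]
    return (fx, fy)


# Example layout (made up): +1 uC at the origin, with +1 uC charges 1 cm away
# along the x axis and along the y axis.
others = [(1e-6, (0.01, 0.0)), (1e-6, (0.0, 0.01))]
fx, fy = net_force(1e-6, (0.0, 0.0), others)
print(fx, fy)  # both components are negative: the charge is pushed away from both
```

Swapping the roles of the two charges in `coulomb_force` flips the sign of the result, which is Newton's third law in action.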
One lingering question is how to relate the microscopic charge carriers, the electrons, to the macroscopic behavior of charged objects. When we charge a glass rod and pick up bits of paper, how many charges are we dealing with? Referring to Table 3.1, the charge on the proton ($p^+$) has a magnitude of $e = 1.6\times10^{-19}$ C, while an electron ($e^-$) has a charge of $-e = -1.6\times10^{-19}$ C.[vii] This means it takes $1/e \approx 6.3\times10^{18}$ protons or electrons to make up a total charge of $\pm 1$ C – so 1 C is a seriously large amount of charge. Typical net charges in electrostatic situations (i.e., static electricity) are of the order of 1 µC,[viii] which is still $10^{12}$ or so electrons – or about one electron for every dollar of our national debt, if that helps bring the magnitude in perspective!
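The counting argument above is a one-line division, sketched here in Python. The helper name `number_of_charges` is hypothetical; the value of $e$ is the rounded one from Table 3.1.

```python
E_CHARGE = 1.6e-19  # magnitude of the elementary charge in coulombs (Table 3.1)


def number_of_charges(total_charge):
    """Number of elementary charges needed to make up total_charge (in C)."""
    return abs(total_charge) / E_CHARGE


print(f"{number_of_charges(1.0):.2e} charges in 1 C")
print(f"{number_of_charges(1e-6):.2e} charges in 1 uC")
```

Even a microcoulomb, a typical static-electricity charge, corresponds to several trillion electrons.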
Question: If two charges of +1 µC are separated by 1 cm ($= 10^{-2}$ m), what is the force between them?
Answer: about 90 N, or roughly 20 lbs!
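The answer can be reproduced directly from Eq. 3.1, as in this short Python sketch. The newton-to-pound conversion factor is a standard value that does not appear in the text.

```python
K_E = 8.9875e9             # Coulomb constant, N m^2 / C^2 (Eq. 3.2)
NEWTONS_PER_POUND = 4.448  # conversion factor (assumed, not from the text)

q1 = q2 = 1e-6  # charges in coulombs (1 uC each)
r = 1e-2        # separation in meters (1 cm)

force_newtons = K_E * q1 * q2 / r**2
force_pounds = force_newtons / NEWTONS_PER_POUND
print(f"{force_newtons:.1f} N, about {force_pounds:.0f} lb")  # prints "89.9 N, about 20 lb"
```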
Technically, Coulomb’s law applies in this particular mathematical form only for point charges, or spherical charge distributions (in which case $r_{12}$ is the distance between the centers of the charge distributions, see Sect. 3.8.3). Coulomb’s law covers electrostatic forces, which are what we call forces between unmoving (stationary) charges. Really, though, we only need to take care when we have charges moving at very high velocities, or when charges accelerate. Accelerating charges produce electromagnetic radiation – light – which we will cover in Chapter 9.
[vi] A conservative force is one which does no net work on a particle that travels along any closed path in an isolated system. For any path, not just a closed one, the work done by a conservative force depends only on the initial and final positions, not on the path taken. Gravity is conservative, friction is not, for example.
[vii] The symbol $e$ will frequently be used to represent the charge of a proton or electron.
[viii] 1 µC $= 10^{-6}$ C, see Table 1.7 in Appendix 1.
The electric field $\vec{E}$ produced by a charge $q$ at the location of a small “test” charge $q_0$ is defined as the electric force $\vec{F}$ exerted by $q$ on $q_0$, divided by the test charge $q_0$.
\[ \vec{E} = \frac{\vec{F}}{q_0} \quad \text{or,} \quad \vec{F} = q_0 \vec{E} \tag{3.3} \]
The SI unit for electric field is Newtons per Coulomb [N/C]. The direction of $\vec{E}$ is the direction of the force that acts on a positive test charge $q_0$ placed in the field.
The test charge q0 is hypothetical – what would the force be on a charge q0 if we did place it
at some distance r away? We say that an electric field exists at a point if a test charge at
that point would be subject to an electric force there.
Using equations 3.1 and 3.3, we can write the magnitude of the electric field[x] due to a charge $q$ as:
\[ |\vec{E}| = k_e \frac{|q|}{r^2} \tag{3.4} \]
The direction of the electric field is the same as the direction of the electric force, since
the two are related by a scalar.
The electric field produced by a charge depends only on the magnitude of that charge
which sets up the field, and how far away from that charge you are. It does not depend
on the presence of a hypothetical test charge.
[ix] We will find out later that light carries electric forces, in fact, and there is no need to invoke “action at a distance.”
[x] When it is unambiguous, we will often write the magnitude of a vector, such as the electric field $|\vec{E}|$, as simply $E$ for convenience. Similarly, $|\vec{B}|$ becomes $B$, and $|\vec{x}|$ becomes $x$.
The principle of superposition also holds for electric fields, just as it did for the electric force. In order to calculate the electric field from a group of charges, one may calculate the field from each charge individually, and add (as vectors) the individual fields. Symmetry is also very important. For example, if two equal charges of the same sign are placed on the x axis at x = a and x = −a, the field at the origin is zero – the fields from the two charges cancel. (For equal and opposite charges, the two fields at the origin point in the same direction and add instead.) On page 90 you can find basic instructions on how to approach and solve electric field problems.
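A minimal Python sketch of superposing fields from two charges on the x axis at x = ±a is given below. It evaluates the field at the origin both for two charges of the same sign and for two charges of opposite sign, so the two cases can be compared. The function name and the numerical values of a and q are hypothetical.

```python
K_E = 8.9875e9  # Coulomb constant, N m^2 / C^2 (Eq. 3.2)


def field_on_x_axis(x, charges):
    """Net x-component of E (N/C) at position x on the x axis.

    charges is a list of (q, x_q) pairs, all located on the x axis.
    """
    total = 0.0
    for q, x_q in charges:
        d = x - x_q
        # The field of a positive charge points away from it; the sign of d
        # carries the direction, and |d|**3 gives the 1/r^2 falloff.
        total += K_E * q * d / abs(d) ** 3
    return total


a = 0.05  # half-separation in meters (made-up value)
q = 1e-9  # charge magnitude in coulombs (made-up value)

same_sign = field_on_x_axis(0.0, [(q, a), (q, -a)])
opposite_sign = field_on_x_axis(0.0, [(q, a), (-q, -a)])
print(same_sign, opposite_sign)
```

For like charges the two contributions cancel at the origin, while for opposite charges they reinforce, giving twice the field of a single charge and pointing toward the negative charge.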
Source                                  $|\vec{E}|$ [N/C]
Fluorescent lighting tube               $10$
Atmosphere (fair weather)               $10^{2}$
Balloon rubbed on hair                  $10^{3}$
Atmosphere (under thundercloud)         $10^{4}$
Photocopier                             $10^{5}$
Spark in air                            $10^{6}$
Across a transistor gate dielectric     $10^{9}$
Near electron in hydrogen atom          $10^{11}$
So $\vec{E}$ is large when the field lines are close together, and small when they are far apart. Below are some examples, to give you an idea.
Figure 3.5: The electric field lines for point charges. (a) For a positive point charge $q$, the lines are directed radially outward. (b) For a negative point charge $-q$, the lines are directed radially inward. Note that the figures show only those field lines that lie in the plane of the page. (c) The dark areas are small pieces of thread suspended in oil, which align with the electric field produced by a small charged conductor at the center.
These 2-D drawings represent field lines for individual point charges. They only contain field lines in the plane of the paper – there are equivalent field lines pointing in all directions. A positive “test charge” placed in the field of the positive charge in Fig. 3.5a would be repelled, hence the lines point outward. On the other hand, for the negative charge in Fig. 3.5b, a positive test charge is attracted and the arrows point in. Note that the lines get more dense as they get closer to the charge, indicating that the field strength is increasing – just what we expect from Equation 3.4.
Figure 3.6 shows nicely symmetric field lines for two charges of equal magnitude and opposite sign. Here we have omitted the arrows for simplicity; by now you should know how to add them in. This configuration is also known as an electric dipole. The number of lines beginning at the positive charge must equal the number of lines ending at the negative charge. Close to each charge, the lines are nearly radial, and the high density of lines between the charges indicates a large electric field in this region. Finally, note that the lines are symmetric about a line connecting the two charges, and about a line perpendicular to that one halfway between the charges.
Figure 3.6: (left) Field lines for two equal and opposite charges, $q$ and $-q$, an “electric dipole.” The number of lines leaving the positive charge equals the number terminating at the negative charge. (right) Field lines for two positive charges of equal magnitude. Can you rank the relative field strengths at points A, B, and C?[xi]
Figure 3.6 also shows the field lines for two positive charges. Again the lines are nearly radial near the charges. The same number of lines leave each charge, since they are of the same magnitude. Far away from either charge, the field looks nearly the same as it would for a single charge twice as big as either lone charge. In between the charges, the field lines “bulge,” representing the repulsive nature of the electric force between like charges. Again, note that the lines are symmetric about a line connecting the two charges, and about a line perpendicular to that one halfway between the charges. The symmetries of the electric field surrounding charge distributions can be very useful in solving electric field problems – for instance, we know without lifting a pencil that, along the vertical line halfway between the two charges, the field has no component parallel to the line joining the charges, and that it is precisely zero at the midpoint.
Figure 3.7: (left) Field lines for opposite charges $q_1$ and $q_2$ of different magnitude. Which is the larger charge? (right) Field lines for two charges of the same sign, but of different magnitude. Which is the larger charge?
Finally, Fig. 3.7 shows the field lines for two charges of different magnitude in two situations.
Can you tell which is the larger charge in each case? Can you tell which plot is for charges of the
same sign, and which is for charges of opposite signs?
The first property is most easily understood by thinking about what would happen if it were
not true, reductio ad absurdum. If there were fields inside a conductor, the free charges would move,
and “bunch up” at the regions of higher and lower field (depending on whether they are positive or
negative). This contradicts the very definition of a conductor – charges are supposed to be mobile,
and spread out evenly through the conductor. Even if we did create a field inside a conductor,
since the charges are mobile they would immediately start to flow to the region where the electric
field is, gathering in sufficient number until they cancelled it out. Anyway, if this happened, we
would no longer have electrostatic equilibrium in the first place, which is defined by no net motion
of charges.
The second property is a result of the $1/r^2$ repulsion of like charges in Equation 3.4. If we had excess charge inside a conductor, the repulsive forces between these excess charges would push them as far apart as possible. Since the charges are mobile in a conductor, this happens readily. Every like charge wants to maximize its distance from every other like charge, so excess charge quickly migrates to the surface.
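[Figure 3.8: an irregularly shaped, positively charged conductor, with points A and B marked. Surviving caption text: “... charge resides at the surface, $\vec{E} = 0$ inside the conductor, and the di[...] Note from the spacing of the positive signs that the surface charge den[...] surface.”]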
This is only true because Coulomb’s law (Equation 3.1) is an inverse square law! If it were some other power law, like $1/r^{2+\delta}$, even for very tiny $\delta$, excess charges would exist inside the conductor, which we could observe. This is one of many special facts about inverse square laws, and it has been used to test Coulomb’s law with fantastic precision.
The third property we also understand by thinking about what would happen if it were not true. If the field were not perpendicular to the conductor’s surface, it would have to have a component parallel to the surface. If that were true, free charges on the surface of the conductor would feel this field, and therefore a force (Eq. 3.3) along the surface. Under this force, they would subsequently flow along the surface, and once again, there is a net flow of charge, so we are by definition not in electrostatic equilibrium.
The fourth property is perhaps easiest to understand geometrically, as a consequence of the third property. The requirement that field lines be perpendicular to the surface forces them to “bunch up” wherever the radius of curvature is small, at “sharp” points, see Fig. 3.8. The presence of a sharp point with a small radius of curvature (that is, high curvature) enhances the electric field in that region, and as a result, the mobile surface charges will instantly flow to this region of high curvature. They will do this until the electric field along the surface is cancelled. The sharper the point, the more charges need to flow into the region to ensure that the parallel component of the surface electric field is totally cancelled. This does result in an uneven surface charge density for irregularly shaped conductors, but also an electric field which is perfectly normal to the surface everywhere.
These rules might be easier to grasp pictorially.[xii] Figure 3.9 shows the field lines between oppositely charged conducting plates – an example of a device known as a capacitor, which we will study in Ch. 4. Note that the field in the region between the plates is very uniform, due to the requirement that it be perpendicular to the surface of the conductors. Near the edges of each plate, the field “fringes”, and starts to curve slightly outward. Further from the edges of the plates, the field starts to resemble that of a dipole (Fig. 3.6, turned 90°). This is no accident – the excess charges on the very edges of the plates do essentially form a dipole, so viewed from far away, the edges of this parallel plate structure look like a long row of dipoles stacked together. Microscopically, this is almost exactly what is happening!
Figure 3.9: (a) Field lines between two oppositely charged plates, (b) a point charge $q$ above a grounded conducting plane, and (c) a point charge near a conducting sphere. Field lines must be perpendicular to the surface of a conductor at every point, and their density increases near “sharp” points. Note also that there are no field lines inside the sphere, as the field inside a conductor must be zero ($E = 0$).
Figure 3.9 also shows the field lines due to a point charge suspended above a grounded conducting
plate. In this case, we again see that field lines always intersect the conducting surface at right
angles. Again, this resembles Fig. 3.6 – this looks like half of the dipole field, as if there were a
mirror halfway between the two charges. This is exactly what is happening – since the field lines
have to intersect the plate at right angles, the point charge a distance d from the conducting plate
behaves in the same way as if there were an equal and opposite charge a distance 2d away. Really,
a conductor is a mirror for electric field lines! One can use this as a problem-solving trick, known
as the “method of images.” This is a bit beyond the scope of the current text, but a neat time-
saving trick to be aware of. What this also means qualitatively is that when a charge is present
near a conductor, the charge induces an equal and opposite charge spread out on the surface of the
conductor. In this case, a charge q above the conducting plate induces an overall charge −q over
the whole surface of the conducting plate.
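To make the image-charge idea concrete, here is a small Python sketch. It treats the grounded plate as if it were replaced by a charge $-q$ a distance $2d$ from the real charge, as described above, and applies Eq. 3.1 to that pair. The function name and numerical values are hypothetical, and this is only an illustration of the trick, not a full treatment of the method of images.

```python
K_E = 8.9875e9  # Coulomb constant, N m^2 / C^2 (Eq. 3.2)


def image_force(q, d):
    """Magnitude (N) of the attraction between a point charge q held a distance
    d (m) from a grounded conducting plane and the plane, modeled as the force
    from an image charge -q located a distance 2d from the real charge."""
    return K_E * q**2 / (2.0 * d) ** 2


# Made-up example: a 1 uC charge held 1 cm from the plane.
print(f"{image_force(1e-6, 0.01):.1f} N")  # prints "22.5 N"
```

The charge is always attracted toward the plate, whatever its sign, since the induced charge on the plate is opposite to it.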
Finally, Fig. 3.9 shows a point charge near a hollow conducting sphere. Note that everywhere, the field is perpendicular to the conducting sphere, and the field is zero inside the conductor. Oddly, this figure looks a bit like what we would expect if the conducting sphere were replaced by another charge, opposite in sign but smaller than the existing point charge. Can you see why this might be? As a hint, think about conductors being mirrors for field lines.
[xii] Appendix B may provide an interesting read for the mathematically inclined.
Question: All four properties are exemplified in Fig. 3.9; can you spot where?
Answer: 1 – inside the hollow sphere. 2 – inside the hollow sphere. 3 – true everywhere, check for yourself. 4 – ends of the plates.
research. Dr. van de Graaff can be considered the inventor of the first accelerator providing intense particle beams of precisely controllable energy, and one of the pioneers of particle physics.[11],[xiv]
The principles of its operation can be understood using the properties of electric fields and
charges you have (hopefully) just learned. Figure 3.10 shows the basic construction of Dr. van de
Graaff’s device, and Fig. 3.11 shows illustrations from Dr. van de Graaff’s original patent on the
“Electrostatic Generator” from 1931. A motor-driven pulley moves a belt past positively-charged
metallic needles at position A. Negative charges are attracted to the needles from the belt, which
leaves the left side of the belt with a net positive charge. The moving belt transfers these positive
charges up toward the conducting dome.
The positive charges attract electrons on to the belt as it moves past a second set of needles at point B, which increases the excess positive charge on the dome. Because the electric field inside the conducting metal dome is negligible (it would be precisely zero if there were not holes in the dome), the positive charge on it can be easily increased – near zero electric field means near zero repulsive force to add more charge. The result is that extremely large amounts of positive charge can be deposited on the dome.
Figure 3.10: A diagram of a van de Graaff generator, showing the conducting dome, belt, insulating stand, and ground. Charge is transferred to the dome by means of a rotating belt. The charge is deposited on the belt at point A and transferred to the dome at point B.
This charge accumulation cannot occur indefinitely. Eventually, the electric field due to the charges becomes large enough to ionize the surrounding air, increasing the air’s conductivity. When sufficiently ionized, the air is nicely conducting, and the charges may rapidly flow off of the dome through the air – a “spark” jumps off of the dome to the nearest ground point. A spectacular example of this can be seen in Figure 3.12.
Since the “sparks” are really charge flowing off of the dome, this eventually limits the highest
electric fields obtainable. The easy solution to increase the voltage is to make the domes bigger
(increase their radius of curvature), and put them higher off the ground (the farther a “spark” has
to go, the more electric field it takes to create one).
One of the largest Van de Graaff generators in the world, built by Dr. Van de Graaff himself, is
now on permanent display at Boston’s Museum of Science (it is the one shown in Figure 3.12). It uses
15 foot aluminum spheres standing on columns many feet tall, and can reach 2 million volts. The
Van de Graaff generator is operated several times per day in the museum’s “Theater of Electricity.”
Figure 3.11: Images from van de Graaff’s original patent on the “Electrostatic Generator,” filed
16 December, 1931. 13
xiv You might think that is how the Tuscaloosa airport got its name. You would be wrong. 12
Van de Graaff information and pictures can be found through the Museum of
Science:
http://www.mos.org/sln/toe/toe.html
More pictures of the largest van de Graaff generator, including its construction and his-
torical pictures, can be found through the MIT Institute Archives:
http://libraries.mit.edu/archives/exhibits/van-de-graaff/
An interesting article from the Tuscaloosa News about Robert van de Graaff:
http://bama.ua.edu/~jharrell/PH106-S06/vandegraaff.htm
You can also visit his boyhood home at 1305 Greensboro Ave.
How do we use this sneaky law? First, we need the concept of electric flux, denoted by ΦE .
Electric flux is a measure of how much the electric field vectors penetrate a given surface. If the
electric field vectors are tangent to the surface at all points, they don’t penetrate at all and the
flux is zero. Basically, we count the number of field lines penetrating the surface – for a closed
surface, lines leaving toward the outside count as positive, and those entering the inside count as
negative.
Figure 3.13: (a) Field lines representing a uniform electric field $\vec{E}$ penetrating a plane of area $A$
perpendicular to the field. The electric flux $\Phi_E$ through this area is equal to $|\vec{E}|A$. (b) Field lines
representing a uniform electric field penetrating an area $A$ that is at an angle $\theta$ to the field; the
projected area is $A' = A\cos\theta$. Because the number of lines that go through the area $A'$ is the same as
the number that go through $A$, the flux through $A'$ is given by $\Phi_E = |\vec{E}|A\cos\theta$.
An analogy for electric flux is fluid flux, which is just the volume of liquid flowing through an
area per second. The electric flux due to an electric field $\vec{E}$, constant in magnitude and direction,
through a flat surface of area $A$ is
$$\Phi_E = |\vec{E}|\,A\cos\theta_{EA} \qquad (3.5)$$
where $\theta_{EA}$ is the angle between the surface normal and the electric field.
Consider the surface in Figure 3.13a. The electric field is uniform in magnitude and direction.
Field lines penetrate the surface of area $A$ uniformly, and are perpendicular to the surface at every
point ($\theta = 0^\circ$). The flux through this surface is just $\Phi_E = |\vec{E}|A$.
Now consider the surface $A$ in Figure 3.13b. The uniform electric field penetrates the area $A$
that is at an angle $\theta$ to the field, so now the flux is $\Phi_E = |\vec{E}|A\cos\theta$. For the surface $A'$, the field
lines are perpendicular, but the area is reduced by the same factor, $A' = A\cos\theta$, so the flux is the
same through $A$ and $A'$.
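As a quick numerical check of Eq. 3.5, here is a minimal Python sketch. The function name `electric_flux` and the sample values of the field and area are illustrative assumptions, not taken from the text.

```python
import math

def electric_flux(field_magnitude, area, angle_deg):
    """Flux of a uniform field through a flat surface (Eq. 3.5).

    angle_deg is the angle between the field and the surface normal.
    """
    return field_magnitude * area * math.cos(math.radians(angle_deg))

# Hypothetical sample values: E = 100 N/C, A = 2 m^2
E, A = 100.0, 2.0
for angle in (0, 60, 90):
    print(angle, "degrees:", electric_flux(E, A, angle), "N m^2/C")
```

At 0 degrees the full flux $|\vec{E}|A$ passes through the surface; at 90 degrees the field is tangent to the surface and the flux vanishes (up to floating-point rounding).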
Just like electric forces and fields, flux also obeys the superposition principle. If we have a
number of charges inside a closed surface, the total flux through that surface is just the sum of the
fluxes from each individual charge.
Now: on to Gauss’s law. What Gauss’s law actually relates is the electric flux through a closed
surface to the total electric charge contained inside that surface – the electric flux through a
closed surface is proportional to the charge contained inside the surface. To see how this
works, consider the point charge in Figure 3.14a. The innermost surface is just a sphere, whose
radius we will call $r$. The strength of the electric field everywhere on this sphere is
$$|\vec{E}| = \frac{k_e q}{r^2} \qquad (3.6)$$
since every point on the sphere’s surface is a distance $r$ from the charge. We also know that $\vec{E}$ is
perpendicular to the surface everywhere, thanks to the radial symmetry. Finally, we know that the
surface area of a sphere is $A = 4\pi r^2$, so the electric flux is
$$\Phi_E = |\vec{E}|A = \frac{k_e q}{r^2}\,4\pi r^2 = 4\pi k_e q \qquad (3.7)$$
If the point charge is outside the surface, Fig. 3.14b, the net flux is zero through that surface
since the same number of field lines enter and leave. If no charge is enclosed by the surface, there
is no net flux.
Now the power in Gauss’s law is that if we take any arbitrarily more complicated surface, so long
as it surrounds the point charge $q$ and doesn’t have holes in it, we will always get the same flux!
What this means is that we are free to choose very convenient surfaces, ones for which the electric
field is just a constant over the whole surface. For convenience, we define a new constant
$\epsilon_0 = 1/4\pi k_e$, known as the “permittivity of free space:”
$$\epsilon_0 = \frac{1}{4\pi k_e} = 8.85 \times 10^{-12}\ \frac{\mathrm{C}^2}{\mathrm{N \cdot m^2}} \qquad (3.8)$$
Recall $k_e$ is Coulomb’s constant from Equation 3.2. (This means of course that we can put all
of our other equations, like Eq. 3.1, in terms of $\epsilon_0$ instead of $k_e$, since $k_e = 1/4\pi\epsilon_0$. You will often
see them this way.) This gives Gauss’s law a nice simple form:
Gauss’s law: the electric flux $\Phi_E$ through any closed surface is equal to the net charge
inside the surface, $Q_{\mathrm{inside}}$, divided by $\epsilon_0$:
$$\Phi_E = \frac{Q_{\mathrm{inside}}}{\epsilon_0} \qquad (3.9)$$
We will not derive Gauss’ law here, but simply state it as fact, and show you a few examples of
how to use it.
Fundamentally, Gauss’ law is a manifestation of the divergence theorem (a.k.a. Green’s theorem
or the Gauss-Ostrogradsky theorem). Essentially, it states that the sum of all sources minus the
sum of all sinks gives the net flow out of a region. The same law applies to fluids. If a fluid
is flowing, and we want to know how much fluid flows out of a certain region, then we need to
add up the sources inside the region and subtract the sinks. The divergence theorem is basically a
conservation law – the volumetric total of all sources minus sinks equals the flow across a volume’s
boundary.
In the case of electric fields, this gives Gauss’ law (Eq. 3.9) – the electric flux through any closed
surface must relate to a net charge inside the volume bounded by that surface. The total flux of
field lines pointing outward through a surface must equal the total flux pointing inward, plus a
contribution proportional to the net charge inside.
This is a manifestation of the fact that electric field lines do have to originate from somewhere –
charges. The difference between the flow of field lines out of a surface and the flow into
a surface is fixed by how much net charge is within the surface; that is all that Gauss’ law
says. This is fundamentally due to the fact that for all inverse square laws, like Coulomb’s law or
Newton’s law of gravitation, the strength of the field falls off as $1/r^2$, but the area of an enclosing
surface increases as $r^2$. The two dependencies cancel out, and we are left with the result that the
flux is only related to the difference between the number of enclosed sources and sinks.
Though Gauss’s law is very powerful, it is usually used in specially symmetric cases (spheres,
cylinders, planes) where it is easy to draw a surface of constant electric field around the charges
of interest (like a sphere around a point charge). We will work through a few of these examples
presently.
Question: Why would we not want to choose a cube as our surface enclosing the point
charge?
Choosing a cube would not give us any nice surfaces with a constant electric field on them.
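Even though a cube is an inconvenient surface for pencil-and-paper work, Gauss’ law still holds for it. The sketch below is a rough numerical illustration, not a derivation: it adds up $E_\perp\,\Delta A$ over small tiles on the six faces of a cube (a midpoint-rule sum) and compares the result with $q/\epsilon_0$. The function name `flux_through_cube`, the cube size, the charge value, and the charge positions are all assumptions chosen for the example.

```python
import math

K_E = 8.9875e9                       # Coulomb's constant, N m^2 / C^2
EPS_0 = 1.0 / (4.0 * math.pi * K_E)  # permittivity of free space (Eq. 3.8)

def flux_through_cube(q, charge_pos, side=1.0, n=100):
    """Midpoint-rule estimate of the flux of a point charge q through a
    cube of the given side length centered on the origin."""
    half = side / 2.0
    h = side / n                      # edge length of one small tile
    total = 0.0
    for axis in range(3):             # face normal along x, y, or z
        u, v = (axis + 1) % 3, (axis + 2) % 3
        for sign in (-1.0, 1.0):      # the two opposite faces
            for i in range(n):
                for j in range(n):
                    point = [0.0, 0.0, 0.0]
                    point[axis] = sign * half
                    point[u] = -half + (i + 0.5) * h
                    point[v] = -half + (j + 0.5) * h
                    d = [point[k] - charge_pos[k] for k in range(3)]
                    r = math.sqrt(d[0] ** 2 + d[1] ** 2 + d[2] ** 2)
                    # component of E along the outward normal of this face
                    e_normal = sign * K_E * q * d[axis] / r ** 3
                    total += e_normal * h * h
    return total

q = 1.0e-9  # hypothetical 1 nC charge
print("q / eps0      :", q / EPS_0)
print("charge inside :", flux_through_cube(q, (0.1, -0.2, 0.05)))
print("charge outside:", flux_through_cube(q, (2.0, 0.0, 0.0)))
```

With the charge anywhere inside the cube, the sum should come out close to $q/\epsilon_0$; with the charge outside, it should come out close to zero. The cube gives the right flux, it just does not give a surface on which $E$ is constant, so it is of no help in solving for $E$ by hand.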
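$$\Phi_E = EA = \frac{Q_{\mathrm{inside}\,1}}{\epsilon_0}, \qquad R_1 > R \qquad (3.10)$$
$$E \times 4\pi R_1^2 = \frac{Q_{\mathrm{inside}\,1}}{\epsilon_0}, \qquad R_1 > R \qquad (3.11)$$
$$\Longrightarrow\ E = \frac{Q_{\mathrm{inside}\,1}}{4\pi\epsilon_0 R_1^2} = \frac{k_e Q_{\mathrm{inside}\,1}}{R_1^2}, \qquad R_1 > R \qquad (3.12)$$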
What we now see is that this is the same thing as the field from a point charge – the field outside
a spherically symmetric charge distribution behaves exactly as if all of its charge is concentrated
at the center. This is, in fact, a particular property of $1/r^2$ laws, and you should recall that this
principle is true in the gravitational case for spherically symmetric mass distributions. The earth
attracts other bodies as if its mass were concentrated at a point in the center. So long as we
are dealing with spherically symmetric distributions, it is not even an approximation to deal with
infinitesimal point charges!
xv Appendix B may help you think about that.
One thing to keep in mind: this is not something like the center of mass. A perfect cube does
not behave as if it had all its mass concentrated at its center. This all really comes from the nature
of 1/r2 forces and the divergence theorem.
What about surface 2, radius R2 , drawn inside the charge distribution? From the analysis above,
all that matters is how much charge is contained inside the surface. The contributions from charge
outside the surface point in all different directions, and by symmetry the whole thing cancels. What
is outside the surface may just as well not exist, so far as the electric field is concerned. Finding
the field at point P2 is then just a matter of figuring out how much charge is inside the second
surface. Depending on the distribution, that may not be so easy ... but it would have been a lot
worse without Gauss’ law.
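To make the “how much charge is inside” step concrete, here is a minimal sketch for one particular case that the text does not work out: a sphere whose charge is spread uniformly through its volume. That uniformity is an assumption of this example, as are the function name `field_uniform_sphere` and the sample numbers. For a uniform sphere of radius $R$ and total charge $Q$, the charge inside a Gaussian sphere of radius $r < R$ is $Q\,(r/R)^3$.

```python
K_E = 8.9875e9  # Coulomb's constant, N m^2 / C^2

def field_uniform_sphere(total_charge, sphere_radius, r):
    """Field magnitude at distance r from the center of a uniformly
    charged sphere (uniform volume charge density is assumed)."""
    if r <= 0:
        raise ValueError("r must be positive")
    if r >= sphere_radius:
        q_inside = total_charge                              # Gaussian surface encloses everything
    else:
        q_inside = total_charge * (r / sphere_radius) ** 3   # fraction of the volume enclosed
    # Gauss' law on a sphere of radius r: E * 4 pi r^2 = q_inside / eps0
    return K_E * q_inside / r ** 2

# Hypothetical values: Q = 1 nC, R = 0.10 m
Q, R = 1.0e-9, 0.10
for r in (0.05, 0.10, 0.20):
    print(r, "m:", field_uniform_sphere(Q, R, r), "N/C")
```

Inside the sphere the field grows linearly with $r$; outside it falls off as $1/r^2$, exactly as for a point charge.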
We have actually developed a more important result than we set out to. Using only Gauss’ law,
we have derived that the field from spherically symmetric charge distributions is equivalent to that
of a point charge, and follows a 1/r2 law. Actually, we have derived Coulomb’s law from Gauss’
law. In fact, the two are equivalent. We could have started from Gauss’ law in the first place
and arrived at Coulomb’s law, instead of assuming Coulomb’s law to be true and then introducing
Gauss’ law. Gauss’ law is in fact far more general in an important way, as we have noted above,
since it gives the equivalence relationship for any flux (e.g., liquids, electric fields, gravitational
fields) flowing out of any closed surface and the enclosed sources and sinks of the flux (e.g., electric
charges, masses). We will see in Ch. 7.2.1 that there is also a Gauss’ law for magnetism, just as
there is a Gauss’ law for gravity, viz.:
Φg = 4πGM (3.13)
where Φg is the flux from the gravitational field through a closed surface, G is the universal
gravitational constant, and M is the mass enclosed by the surface. Just as we proved that any
spherically symmetric charge distribution behaves as a point charge and follows an inverse square
law, one can prove that any spherically symmetric mass distribution is equivalent to a point mass,
and follows the familiar inverse square law for gravitation.
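As a sanity check on Eq. 3.13, the sketch below applies it to a sphere drawn at the earth’s surface and recovers the familiar value of $g$. The function name is hypothetical, and the rounded values of $G$, the earth’s mass, and the earth’s mean radius are supplied here for illustration; they are not given in the text.

```python
import math

G = 6.674e-11  # universal gravitational constant, N m^2 / kg^2

def gravitational_field(mass_enclosed, r):
    """Field magnitude on a sphere of radius r around a spherically
    symmetric mass, from Phi_g = 4 pi G M (Eq. 3.13)."""
    flux = 4.0 * math.pi * G * mass_enclosed
    area = 4.0 * math.pi * r ** 2
    return flux / area  # reduces to G M / r^2

M_EARTH = 5.972e24  # kg (rounded)
R_EARTH = 6.371e6   # m, mean radius (rounded)
print(gravitational_field(M_EARTH, R_EARTH), "m/s^2")
```

The result should land near 9.8 m/s$^2$, which is just the inverse square law $GM/r^2$ evaluated at the earth’s surface.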
uniform. Since we do not want to restrict ourselves to a plate of any particular size, but rather,
solve a general problem, we will say that the plate has a certain charge per unit area σE , defined
as the total charge of the plate divided by its surface area. That way, we can later find the field
near any plate.
What sort of surface should we take to find the flux? A plain box is a good choice, as it turns
out, due to the symmetry of the problem. We will take a box with a top and bottom whose areas
are each $A$. The area of the sides is not important, as it turns out, but we can call it $B$ just to be
complete.
Why would we choose a box in this case, when we just said it is a bad choice for a point charge?
We know that the field is perpendicular to the surface of a conductor everywhere, so in this case
the field is going to be purely perpendicular to the plate. Therefore, it is only important that we
draw a Gaussian surface such that every part of the surface is either perfectly parallel or perfectly
perpendicular to the plate. A cylinder would work perfectly fine too, which should be clear from
the rest of the discussion.
Along the surface making up the sides of the box, the flux is zero since the field lines are parallel
to it everywhere. On the top end cap, the field is perfectly normal to the surface. The bottom end
cap is completely inside the conductor, so we know the field has to be zero there! If we call the
magnitude of electric field above the plate $E$, we can readily calculate the flux. Because the
plate is supposed to be very large in extent, the field can be assumed to be completely uniform
so long as the distances above the plate we consider are small compared to the size of the plate.
Figure 3.16: (a) A large, flat charged conducting slab. The charge distribution on the surface, and
hence electric field, are uniform. (b) A cylinder is our surface for Gauss’ law. Along the sides of the
cylinder, the flux is zero since the field lines are parallel – the flux is non-zero only through the
end caps.
The total charge enclosed by our Gaussian surface is just the area $A$ of the plate that it encloses
times the charge per unit area $\sigma_E$:
$$Q_{\mathrm{inside}} = \sigma_E A \qquad (3.14)$$
Applying Gauss’ law is now straightforward; we just have to find the flux through the top end cap:
$$\Phi_E = EA = \frac{Q_{\mathrm{inside}}}{\epsilon_0} = \frac{\sigma_E A}{\epsilon_0} \qquad (3.15)$$
$$\Longrightarrow\ E = \frac{\sigma_E}{\epsilon_0} \qquad (3.16)$$
No problem! The electric field is indeed constant, as it has to be, and independent of the distance
from the plate. This makes sense too, since the plate is supposed to be very, very large. Strictly,
this is true only for an infinite plate, but it is close so long as we consider distances above the plate
which are very small compared to the size of the plate. Finally, it should be clear now that it didn’t
matter what sort of shape we used at all, so long as it has a flat end parallel to the plate, and sides
perpendicular to it.
3.8.5 Example: The Field Inside and Outside a Hollow Spherical Conductor
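$$\Phi_E = EA = \frac{Q_{\mathrm{inside}}}{\epsilon_0}, \qquad r < R_1 \qquad (3.17)$$
$$E \times 4\pi r^2 = \frac{Q_1}{\epsilon_0}, \qquad r < R_1 \qquad (3.18)$$
Now we just need to solve for $E$, and make use of the fact that $\epsilon_0 = \frac{1}{4\pi k_e}$ (Eq. 3.8):
$$E = \frac{Q_1}{4\pi\epsilon_0 r^2} = \frac{k_e Q_1}{r^2}, \qquad r < R_1 \qquad (3.19)$$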
Of course, this makes perfect sense – the field inside is just that of the point charge, as if the
conductor were not there at all! As we saw above, electric fields are like gravitational fields in this
way – inside a spherical shell, both the gravitational and electrical forces cancel in all directions by
symmetry.
Next, we consider surface 2, a surface inside the conductor itself. We know already that every-
where inside the conductor, i.e., R1 < r < R2 , we must have E = 0. Done! That seemed too easy,
didn’t it? It was – we missed one little point.
In the end, we also want to find the field outside the shell entirely, and for this we have to
consider surface 3. First we have to be careful, and think about what we have missed. For surface
2, drawn inside the conductor, we said $E = 0$ as it has to be for a conductor. This is true. But
how can that be, with a point charge sitting right inside? On its own, it can’t: what happens is that
the point charge $Q_1$ induces an equal but opposite charge $Q_2 = -Q_1$ on the inside surface of the
conductor. Think of it this way – if this did not happen, then the total charge enclosed by surface 2
would not be zero, and by Gauss’ law the field inside the conductor could not be zero. Then we
would have a contradiction on our hands, which is not OK. The induced charge $Q_2$ ensures that the
total charge enclosed by surface 2 is zero, and thus the field inside the conductor is zero as it has
to be. This is another aspect of conductors looking like mirrors for field lines. Physically, the
charge $Q_1$ attracts opposite mobile charges in the conductor, giving (for positive $Q_1$) a net negative
charge on the inner surface.
Now, what about surface 3? Before we placed the charge Q1 inside the conductor, it was
electrically neutral. This still has to be true after we place the charge – overall, the conductor
must have no net charge. Well, if there is a charge Q2 = −Q1 on the inner surface, and overall it is
neutral, then there must be a charge Q3 = Q1 induced on the outer surface to cancel the induced
charge on the inner surface. The net negative charge on the inner surface attracted by the point
charge Q1 leaves a deficit of negative charge on the outer surface, for a net positive surface charge.
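Now we can run Gauss’ law for surface 3:
$$\Phi_E = EA = \frac{Q_{\mathrm{inside}}}{\epsilon_0}, \qquad r > R_2 \qquad (3.20)$$
$$E \times 4\pi r^2 = \frac{Q_1 + Q_2 + Q_3}{\epsilon_0}, \qquad r > R_2 \qquad (3.21)$$
$$4\pi r^2 E = \frac{Q_1 - Q_1 + Q_1}{\epsilon_0}, \qquad r > R_2 \qquad (3.22)$$
$$\Longrightarrow\ E = \frac{Q_1}{4\pi\epsilon_0 r^2}, \qquad r > R_2 \qquad (3.23)$$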
Lo and behold, the field outside the sphere looks just like that of the original point charge, same
as inside the sphere (remembering that $\epsilon_0 = \frac{1}{4\pi k_e}$, Eq. 3.8). Again, what happens physically is that
the point charge pulls the mobile charges from the conductor to its inner surface, leaving the inner
surface with an equal and opposite charge. This means that the outer surface must be deficient in
those same charges, and thus has an equal and like charge to Q1 .
Now we can combine our results, and we have the electric field in all three regions:
$$E = \frac{k_e Q_1}{r^2}, \qquad r > R_2 \qquad (3.24)$$
$$E = 0, \qquad R_1 < r < R_2 \qquad (3.25)$$
$$E = \frac{k_e Q_1}{r^2}, \qquad r < R_1 \qquad (3.26)$$
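The bookkeeping in this example (which charges a given Gaussian sphere encloses) can be sketched in a few lines of Python. The function names and the sample radii and charge below are assumptions made for illustration; the physics is just Eqs. 3.17 through 3.26 for a neutral, isolated hollow conductor with a point charge at its center.

```python
K_E = 8.9875e9  # Coulomb's constant, N m^2 / C^2

def enclosed_charge(q1, r_inner, r_outer, r):
    """Net charge inside a Gaussian sphere of radius r."""
    q2 = -q1   # induced on the inner surface of the conductor
    q3 = +q1   # left behind on the outer surface (conductor is neutral overall)
    enclosed = q1
    if r > r_inner:
        enclosed += q2
    if r > r_outer:
        enclosed += q3
    return enclosed

def field_hollow_conductor(q1, r_inner, r_outer, r):
    """Radial field from Gauss' law: E * 4 pi r^2 = Q_inside / eps0."""
    if r <= 0:
        raise ValueError("r must be positive")
    return K_E * enclosed_charge(q1, r_inner, r_outer, r) / r ** 2

# Hypothetical values: Q1 = 1 nC, R1 = 0.10 m, R2 = 0.20 m
for r in (0.05, 0.15, 0.40):
    print(r, "m:", field_hollow_conductor(1.0e-9, 0.10, 0.20, r), "N/C")
```

The field is that of the bare point charge for $r < R_1$ and $r > R_2$, and zero in between, matching the three-region summary above.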
As one last example, we will use Gauss’ law to find the electric field due to an infinite line of charge,
or equivalently, a conducting wire with a net surface charge, as shown in Fig. 3.18a. What does
the electric field look like? If the line of charge is infinite (or at least very long compared to the
distance we are away from it), all of the components of the field parallel to the wire will cancel each other,
and by symmetry, the field must be radially symmetric about the wire. That is, the field must
point perpendicularly away from the wire axis.
With the symmetry of the wire being cylindrical, it makes most sense to use a cylinder drawn
concentrically around the wire as our Gaussian surface, Fig. 3.18b. We will choose a cylinder of
radius $r$, and length $l$. The field is parallel to the end caps of the cylinder, so they contribute
no flux at all. Being radially symmetric, the field is perfectly perpendicular to the round surface of
the cylinder, and we can easily calculate the flux and find the electric field. First, we remember
that the surface area of a cylinder (without the end caps) is $2\pi r l$. Second, the cylinder of length
$l$ encloses a length $l$ of the wire, which must contain a charge $\lambda l$ since $\lambda$ is the charge per unit
length. Putting that all together:
$$\Phi_E = E \cdot 2\pi r l = \frac{Q_{\mathrm{encl}}}{\epsilon_0} = \frac{\lambda l}{\epsilon_0} \qquad (3.27)$$
$$\Longrightarrow\ E = \frac{\lambda}{2\pi r \epsilon_0} = \frac{2 k_e \lambda}{r} \qquad (3.28)$$
Figure 3.18: (a) An “infinite” line charge, with charge $\lambda$ per unit length. (b) A cylindrical Gaussian
surface. On the caps of the cylinder, the field is parallel, and the flux is zero.
In this case, the field falls off as $1/r$, more slowly than that of a point charge, but it is not independent of
distance like we found for the sheet of charge. It is independent of the length of the cylinder we
chose, as it must be: the wire is supposed to be infinite, and the value of l was chosen arbitrarily!
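The three Gauss’ law results can be compared side by side with a short sketch. The function names and the sample values of $q$, $\lambda$, and $\sigma_E$ are assumptions chosen only to show how each field changes when the distance doubles.

```python
import math

K_E = 8.9875e9                       # Coulomb's constant, N m^2 / C^2
EPS_0 = 1.0 / (4.0 * math.pi * K_E)  # permittivity of free space (Eq. 3.8)

def field_point_charge(q, r):
    """Point charge or spherically symmetric distribution (Eq. 3.12)."""
    return K_E * q / r ** 2

def field_line_charge(lam, r):
    """Infinite line with charge lam per unit length (Eq. 3.28)."""
    return 2.0 * K_E * lam / r

def field_conducting_plate(sigma):
    """Large conducting plate with surface charge density sigma (Eq. 3.16)."""
    return sigma / EPS_0

# Hypothetical values: 1 nC, 1 nC/m, 1 nC/m^2
q, lam, sigma = 1.0e-9, 1.0e-9, 1.0e-9
for r in (0.01, 0.02, 0.04):
    print(r, field_point_charge(q, r), field_line_charge(lam, r),
          field_conducting_plate(sigma))
```

Each doubling of $r$ cuts the point-charge field by a factor of four and the line-charge field by a factor of two, while the plate field does not change at all.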
3.9 Miscellanea
Solving electric field problems
1. Convert all units to SI – charges in Coulombs, distances in meters.
2. Draw a diagram of the charges in the problem.
3. Identify the charge of interest, and what you want to know about it.
4. Choose your coördinate system and origin – pick the most convenient one based
on the symmetry of the problem. Usually, this is an x−y Cartesian system, with
the origin at some special point (e.g., on one charge or between two charges)
5. Apply Coulomb’s law: for each charge Q, find the electric force on the charge
of interest q. The vector direction of the force is along the line of the two charges,
directed away from Q if it has the same sign as q and toward Q if it has the
opposite sign as q. Find the angle θ this vector makes with the positive x axis – the
x component of the electric force will be F cos θ, the y component will be F sin θ.
6. Sum the x components from each charge Q to get the resultant x component of
the electric force.
7. Sum the y components from each charge Q to get the resultant y component of
the electric force.
8. Find the total resultant force from the total x and y components, using the
Pythagorean theorem and trigonometry to find the magnitude and direction:
$|F_{\mathrm{tot}}| = \sqrt{|F_x|^2 + |F_y|^2}$ and $\tan\theta = \frac{F_y}{F_x}$
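Steps 5 through 8 above can be sketched in Python as follows. This is only one way to organize the calculation; the function name `net_force`, the argument layout, and the sample charges are assumptions made for this example. Instead of finding the angle $\theta$ for each force, the sketch uses the unit vector pointing from Q to q, which gives the same components.

```python
import math

K_E = 8.9875e9  # Coulomb's constant, N m^2 / C^2

def net_force(q, q_pos, sources):
    """Net electric force on charge q at q_pos = (x, y), in SI units.

    sources is a list of (Q, (x, y)) pairs.
    Returns (Fx, Fy, magnitude, angle in degrees from the +x axis).
    """
    fx = fy = 0.0
    for big_q, (x, y) in sources:
        dx, dy = q_pos[0] - x, q_pos[1] - y    # vector pointing from Q to q
        r = math.hypot(dx, dy)
        f = K_E * big_q * q / r ** 2           # positive: repulsive, away from Q
        fx += f * dx / r                       # step 6: sum the x components
        fy += f * dy / r                       # step 7: sum the y components
    magnitude = math.hypot(fx, fy)             # step 8: Pythagorean theorem
    angle = math.degrees(math.atan2(fy, fx))   # direction, quadrant-aware
    return fx, fy, magnitude, angle

# Hypothetical example: two +1 uC charges 1 cm apart along x
print(net_force(1.0e-6, (0.01, 0.0), [(1.0e-6, (0.0, 0.0))]))
```

Using `math.atan2` rather than inverting $\tan\theta = F_y/F_x$ directly avoids the quadrant ambiguity when $F_x$ is negative.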
1. Two charges of +1 µC each are separated by 1 cm. What is the force between them?
0.89 N
90 N
173 N
15 N
4. Which of the following is true for the electric force, but not the gravitational force?
5. Two charges of +1 µC are separated by 1 cm. What is the magnitude of the electric field
halfway between them?
$9 \times 10^{7}$ N/C
$4.5 \times 10^{7}$ N/C
0
$1.8 \times 10^{8}$ N/C
6. A circular ring of charge of radius b has a total charge of q uniformly distributed around it.
The magnitude of the electric field at the center of the ring is:
0
$k_e q/b^2$
$k_e q^2/b^2$
$k_e q^2/b$
none of these.
7. Two isolated identical conducting spheres have a charge of q and −3q, respectively. They
are connected by a conducting wire, and after equilibrium is reached, the wire is removed (such
that both spheres are again isolated). What is the charge on each sphere?
q, −3q
−q, −q
0, −2q
2q, −2q
8. A single point charge +q is placed exactly at the center of a hollow conducting sphere of
radius R. Before placing the point charge, the conducting sphere had zero net charge. What is
the magnitude of the electric field outside the conducting sphere at a distance r from the center
of the conducting sphere? I.e., the electric field for r > R.
$|\vec{E}| = -\frac{k_e q}{r^2}$
$|\vec{E}| = \frac{k_e q}{(R+r)^2}$
$|\vec{E}| = \frac{k_e q}{R^2}$
$|\vec{E}| = \frac{k_e q}{r^2}$
10. Referring again to the figure above, which set of electric field lines could represent the
electric field near two charges of opposite sign and different magnitudes?
a
b
c
d
11. A “free” electron and a “free” proton are placed in an identical electric field. Which of the
following statements are true? Check all that apply.
Each particle is acted on by the same electric force and has the same acceleration.
The electric force on the proton is greater in magnitude than the force on the electron, but in
12. A point charge q is located at the center of a (non-conducting) spherical shell of radius
a that has a charge −q uniformly distributed on its surface. What is the electric field for all
points outside the spherical shell?
none of these
$E = 0$
$E = q/4\pi r^2$
$E = kq/r^2$
$E = kq^2/r^2$
13. What is the electric field inside the same shell a distance r < a from the center (i.e., a point
inside the spherical shell)?
$E = kq/r^2$
$E = kq^2/r^2$
none of these
$E = 0$
$E = q/4\pi r^2$
14. What is the electric flux through the surface at right?
+5 C/$\epsilon_0$
−3 C/$\epsilon_0$
0
+6 C/$\epsilon_0$
[Figure: a closed surface drawn among point charges labeled −2 C, +2 C, +3 C, +5 C, −2 C, and −1 C.]
15. A spherical conducting object A with a charge of +Q is lowered through a hole into a metal
(conducting) container B that is initially uncharged (and is not grounded). When A is at the
center of B, but not touching it, the charge on the inner surface of B is:
+Q
−Q
0
+Q/2
−Q/2
17. A flat surface having an area of 3.2 m$^2$ is rotated in a uniform electric field of magnitude
$E = 5.7 \times 10^{5}$ N/C. What is the electric flux when the electric field is parallel to the surface?
$1.82 \times 10^{6}$ N·m$^2$/C
0 N·m$^2$/C
3.64 N·m$^2$/C
0.91 N·m$^2$/C
3.11 Problems
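1. Two charges of $+10^{-6}$ C are separated by 1 m along the vertical axis. What is the net
horizontal force on a charge of $-2\times10^{-6}$ C placed one meter to the right of the lower charge?
[Figure: two $+10^{-6}$ C charges 1 m apart on the vertical axis, with a $-2\times10^{-6}$ C charge 1 m to the
right of the lower one.]
2. Three point charges lie along the x axis, as shown at left. A positive charge $q_1 = 15$ µC is at
$x = 2$ m, and a positive charge of $q_2 = 6$ µC is at the origin. Where must a negative charge $q_3$ be
placed on the x-axis between the two positive charges such that the resulting electric force on it
is zero?
[Figure: $q_2$ (+), $q_3$ (−), and $q_1$ (+) along the x axis, 2.0 m from $q_2$ to $q_1$; $q_3$ lies a distance $r_{23}$
from $q_2$ and $2.0\ \mathrm{m} - r_{23}$ from $q_1$, with the fields $E_{23}$ and $E_{13}$ marked at $q_3$.]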
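1. 90 N. We just need to use Eq. 3.1 and plug in the numbers ... remembering that µ means
$10^{-6}$:
$$\begin{aligned}
\vec{F} &= k_e \frac{q_1 q_2}{r_{12}^2}\,\hat{r}_{12} \\
|\vec{F}| &= k_e \frac{q_1 q_2}{r_{12}^2} \\
&= \left[8.9875 \times 10^{9}\ \frac{\mathrm{N \cdot m^2}}{\mathrm{C^2}}\right] \frac{\left(1 \times 10^{-6}\ \mathrm{C}\right)\left(1 \times 10^{-6}\ \mathrm{C}\right)}{\left(1 \times 10^{-2}\ \mathrm{m}\right)^2} \\
&\approx \left[9 \times 10^{9}\ \frac{\mathrm{N \cdot m^2}}{\mathrm{C^2}}\right] \frac{1 \times 10^{-12}\ \mathrm{C^2}}{1 \times 10^{-4}\ \mathrm{m^2}} \\
&= 9 \times 10^{1}\ \mathrm{N} \\
|\vec{F}| &= 90\ \mathrm{N}
\end{aligned}$$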
2. Always zero. Re-read Sect. 3.5 to remind yourself why this must be true.
4. The force between two bodies can be repulsive as well as attractive. Both the
electric and gravitational forces propagate at the speed of light, both act through empty space,
and both are inverse-square laws. The only difference is that gravity can only be attractive,
since there is no such thing as negative mass.
5. 0. Halfway between, the magnitude of the field from each individual charge is the same,
but they act in opposite directions. Therefore, exactly in the middle, they cancel, and the field
is zero. (This is unlike the midpoint of an electric dipole, where the two charges have opposite signs and their fields add.) It might be
easier to convince yourself the field is zero if you draw a picture including the electric field lines.
6. 0. The field at the center from a point on the ring is always canceled by the field from
another point 180◦ away.
7. −q, −q. The thing to remember is that any charge on a conductor spreads out evenly over
its surface. When we have the conducting spheres isolated, they have q and −3q respectively,
and this charge is spread evenly over each sphere. When we connect them with a conducting
wire, suddenly charges are free to move from one conductor, across the wire, into the other
conductor. It’s just the same as if we had one big conductor, and the total net charge of the
two conductors combined will spread out evenly over both spheres and the wire.
If the charge from each sphere is allowed to spread out evenly over both spheres, then the −3q
and +q will both be spread out evenly everywhere. The +q will cancel part of the −3q, leaving
a total net charge of −2q spread evenly over both spheres, or −q on each sphere. Once we
disconnect the two spheres again, the charge remains equally distributed between the two.
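The arithmetic here is simple enough to do in your head, but a tiny sketch makes the rule explicit. The function name `share_charge` is hypothetical, and the equal split assumes, as in the question, that the two spheres are identical.

```python
def share_charge(q_a, q_b):
    """Charge on each of two identical conducting spheres after they are
    connected by a wire and then separated (equal split assumed)."""
    each = (q_a + q_b) / 2.0
    return each, each

# Charges measured in units of q: +1q and -3q
print(share_charge(1.0, -3.0))  # (-1.0, -1.0), i.e. -q on each sphere
```

The total charge is conserved (−2q before and after); only its distribution between the two spheres changes.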
8. $|\vec{E}| = \frac{k_e q}{r^2}$. The easiest way out of this one is Gauss’ law. First, Gauss’ law told us that any
spherically symmetric charge distribution behaves as a point charge. Second, Gauss’ law tells
us that the electric flux out of some surface depends only on the enclosed charge. If we draw
a spherical surface of radius $r$ and area $A$ around the shell and point charge, centered on the
center of the conducting sphere, Gauss’ law gives:
$$\begin{aligned}
\Phi_E &= \frac{q_{\mathrm{encl}}}{\epsilon_0} = 4\pi k_e q_{\mathrm{encl}} \\
EA &= 4\pi k_e q_{\mathrm{encl}} \\
E &= \frac{4\pi k_e q_{\mathrm{encl}}}{A}
\end{aligned}$$
The surface area of a sphere is A = 4πr2 . In this case, the enclosed charge is just q, since the
hollow conducting sphere itself has no charge of its own. Gauss’ law only cares about the total
net charge inside the surface of interest. This gives us:
$$E = \frac{4\pi k_e q}{4\pi r^2} = \frac{k_e q}{r^2}$$
There we have it, it is just the field of a point charge q at a distance r.
If we want to get formal, we should point out that the point charge q induces a negative charge
−q on the inner surface of the hollow conducting sphere. Since the sphere is overall neutral, the
outer surface must therefore have a net positive charge +q on it. This makes no difference in
the result – the total enclosed charge, for radii larger than that of the hollow conducting sphere
(r > R), is still just q. If we start with an uncharged conducting sphere, and keep it physically
isolated, any induced charges have to cancel each other overall.
If this is still a bit confusing, go back and think about induction charging again. A charged
rod was used to induce a positive charge on one side of a conductor, and a negative charge on
the other. Overall, the ‘induced charge’ was just a rearrangement of existing charges, so if the
conductor started out neutral, no amount of ‘inducing’ will change that. We only ended up with
a net charge on the conductor when we used a ground connection to ‘drain away’ some of the
induced charges. Or, if you like, when we used a charged rod to repel some of the conductor’s
charges through the ground connection, leaving it with a net imbalance.
9. (b). If the charges are of the opposite sign, then the field lines would have to run from one
charge directly to the other. Field lines start on a positive charge and end on a negative one,
and there should be many lines which run from one charge to the other. Since opposite charges
attract, the field between them is extremely strong, the lines should be densest right between
the charges. This is the case in (a) and (c), so they are not the right ones.
By the same token, for charges of the same sign, the force is repulsive, and the electric field
midway between them cancels. The field lines should “push away” from each other, and no field
line from a given charge should reach the other charge – field lines cannot start and end on the
same sign charge. This means that only (b) and (d) could possibly correspond to two charges
of the same sign.
Next, the number of field lines leaving or entering a charge has to be proportional to the magnitude of
the charge. In (d) there are the same number of lines entering and leaving each charge, so
the charges are of the same magnitude. One can also see this from the fact that the lines are
symmetric about a vertical line drawn midway between the charges. In (b) there are clearly
Or, right off the bat, you could notice that only (a) and (b) are asymmetric, and only (b) and
(d) look like two like charges. No sense in over-thinking this one.
10. (a). By similar reasoning as above, only figure a could represent two opposite charges of
different magnitude.
11. The electric force on the proton is equal in magnitude to the force on the electron, but in
the opposite direction. The magnitude of the acceleration of the electron is greater than that
of the proton.
12. $E = 0$. The simplest way to solve this one is with Gauss’ law. First, Gauss’ law told us
that any spherically symmetric charge distribution behaves as a point charge. Second, Gauss’
law tells us that the electric flux out of some surface depends only on the enclosed charge. If
we draw a spherical surface of radius $r$ and area $A$ enclosing the shell and the point charge,
centered on the center of the shell, the total enclosed charge is that of the shell
plus that of the point charge: $q_{\mathrm{encl}} = q + (-q) = 0$. If the enclosed charge is zero for any sphere
drawn outside of and enclosing the spherical shell, then the electric field is zero for all points
outside the spherical shell.
13. $E = k_e q/r^2$. Just like the last question, we need Gauss’ law. This time, we have to draw
a sphere surrounding the point charge, but inside of the spherical shell. Gauss’ law tells us
that the electric field depends only on the enclosed charge within our sphere. The only charge
enclosed is the point charge at the center of the shell, q – the charge on the spherical shell is
outside of our spherical surface, so it is not enclosed and does not contribute to the electric field
inside. Now we just apply Gauss’ law, knowing that the enclosed charge is q, and the surface
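area of the sphere is 4πr²:
\begin{align*}
\Phi_E &= \frac{q_{\text{encl}}}{\epsilon_0} = 4\pi k_e q \tag{3.29} \\
EA &= 4\pi k_e q \tag{3.30} \\
E &= \frac{4\pi k_e q}{4\pi r^2} = \frac{k_e q}{r^2} \tag{3.31}
\end{align*}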
14. +6 C/ε₀. Again, this question requires Gauss’ law. We know that the electric flux through
this surface only depends on the total amount of enclosed charge. All we need to do is add up
the net charge inside the surface, since any charges outside the surface do not contribute to the
flux. There are only three charges enclosed by the surface ... so:
The electric flux Φ_E is then just the enclosed charge divided by ε₀, or +6 C/ε₀.
15. −Q. The charge +Q on object A induces a negative charge −Q on the inner surface of the
conducting container B.
16. 1.8 m to the left of the negative charge. By symmetry, we can figure out on which side
the field should be zero. In between the two charges, the fields from the positive and negative
charges add together. A fictitious positive test charge placed in between the two
would experience a force to the left due to the positive charge, and another force to the left due
to the negative charge. There is no way the fields can cancel here.
If we place a positive charge to the right of the positive charge, it will feel a force to the right
from the positive charge, and a force to the left from the negative charge. The directions are
opposite, but the fields still cannot cancel because the test charge is closest to the larger charge.
This leaves us with points to the left of the negative charge. The forces on a positive test charge
will be in opposite directions here, and we are closer to the smaller charge. What position gives
zero field? First, we will call the position of the negative charge x = 0, which means the positive
charge is at x = 1 m. We will call the position where electric field is zero x. The distance from
this point to the negative charge is just x, and the distance to the positive charge is 1 + x. Now
write down the electric field due to each charge:
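\begin{align*}
E_{\text{neg}} &= \frac{k_e\,(-2.5\,\mu\text{C})}{x^2} \\
E_{\text{pos}} &= \frac{k_e\,(6\,\mu\text{C})}{(1+x)^2} \\
E_{\text{neg}} + E_{\text{pos}} &= 0 \\
\frac{k_e\,(-2.5\,\mu\text{C})}{x^2} + \frac{k_e\,(6\,\mu\text{C})}{(1+x)^2} &= 0 \\
\frac{-2.5}{x^2} + \frac{6}{(1+x)^2} &= 0 \qquad \text{(the common factors } k_e \text{ and } \mu\text{C cancel)} \\
\Longrightarrow \quad \frac{2.5}{x^2} &= \frac{6}{(1+x)^2}
\end{align*}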
we already ruled out. The positive root, x = 1.82, means a distance 1.82 m to the left of the
negative charge. This is what we want.
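As a quick numerical cross-check of the positive root, here is a minimal Python sketch. It assumes the geometry described above (charges 1 m apart, field point beyond the smaller charge); the function name `zero_field_distance` is our own label, not something from the text. Taking the square root of 2.5/x² = 6/(1 + x)² gives (1 + x)/x = √(6/2.5), which can be solved for x directly.

```python
import math

def zero_field_distance(q_near, q_far, separation):
    """Distance x beyond the smaller charge at which the two fields cancel.

    Solves |q_near| / x**2 = |q_far| / (separation + x)**2 for x > 0.
    This only has a positive solution when |q_far| > |q_near|.
    """
    ratio = math.sqrt(abs(q_far) / abs(q_near))
    return separation / (ratio - 1.0)

# Values from the problem: -2.5 uC and +6 uC, 1 m apart.
x = zero_field_distance(q_near=-2.5e-6, q_far=6e-6, separation=1.0)
print(f"x = {x:.2f} m to the left of the negative charge")
```

Only the ratio of the two charge magnitudes enters, which is why k_e and the µC units dropped out of the algebra.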
17. 0 N·m²/C. Remember that electric flux is Φ_E = EA cos θ, where θ is the angle between
a line perpendicular to the surface and the electric field. If E is parallel to the surface, then
θ = 90° and Φ_E = 0.
Put more simply, there is only an electric flux if field lines penetrate the surface. If the field is
parallel to the surface, no field lines penetrate, and there is no flux.
18. 16 µN, down (−90°). The easiest way to solve this one is by symmetry and elimination.
The negative charge q2 feels an attractive force from both q1 and q3. Since both charges are
the same vertical distance away and below q2, both will give a force in the vertical downward
direction of equal magnitude. Since both charges are horizontally the same distance
away but on opposite sides, the horizontal forces will be equal in magnitude but opposite
in direction – the horizontal forces will cancel. Therefore, the net force has to be purely in the
vertical direction and downward, so the second choice is the only option! Of course, you can
calculate all of the forces by components and add them up ... you will arrive at the same answer.
1. −0.0244 N. We are only interested in the x component of the force, which makes things
easier. First, we are trying to find the force on a negative charge due to two positive charges.
Both positive charges are to the left of the negative charge, and both forces will be attractive.
We will adopt the usual convention that the positive horizontal direction is to the right and
called +x, and the negative horizontal direction is to the left and called −x.
First, we will find the force on the negative charge due to the positive charge in the lower left,
which we will call “1” to keep things straight. We will call the negative charge “2.” This is easy,
since the force is purely in the −x direction:
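\begin{align*}
F_{x,1} &= k_e \frac{q_1 q_2}{r_{12}^2} \\
&= \left(9\times10^{9}\ \text{N}\cdot\text{m}^2/\text{C}^2\right) \frac{\left(10^{-6}\ \text{C}\right)\left(-2\times10^{-6}\ \text{C}\right)}{(1\ \text{m})^2} \\
&= \left(9\times10^{9}\ \text{N}\right)\left(-2\times10^{-12}\right) \qquad \text{(the units m}^2 \text{ and C}^2 \text{ cancel)} \\
&= -18\times10^{-3}\ \text{N}
\end{align*}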
So far so good, but now we have to include the force from the upper left-hand positive charge,
which we’ll call “3.” We calculate the force in exactly the same way, with two small differences:
the separation distance is slightly larger, and now the force has both a horizontal and vertical
component. First, let’s calculate the magnitude of the net force; we’ll find the horizontal
component after that.
Plane geometry tells us that the separation between charges 3 and 2 has to be √2 · 1 m, or √2 m
– connecting the charges with straight lines forms a 1-1-√2 right triangle, with 45° angles.
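\begin{align*}
F_{\text{net},3} &= k_e \frac{q_2 q_3}{r_{23}^2} \\
&= \left(9\times10^{9}\ \text{N}\cdot\text{m}^2/\text{C}^2\right) \frac{\left(10^{-6}\ \text{C}\right)\left(-2\times10^{-6}\ \text{C}\right)}{\left(\sqrt{2}\ \text{m}\right)^2} \\
&= \frac{\left(9\times10^{9}\ \text{N}\cdot\text{m}^2/\text{C}^2\right)\left(-2\times10^{-12}\ \text{C}^2\right)}{2\ \text{m}^2} \\
&= -9\times10^{-3}\ \text{N}
\end{align*}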
So the net force from the upper left charge is just half as much, since it is a factor √2 farther
away. We only want the horizontal component though! Since we are dealing with a 45-45-90
triangle here, the horizontal component is just the net force times cos 45°:
The total horizontal force is just the sum of the horizontal forces from the two positive charges:
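The arithmetic for this last step can be sketched in a few lines of Python. This is only an illustration under the values used above (k_e rounded to 9 × 10⁹ N·m²/C², charges of +1 µC and −2 µC, separations of 1 m and √2 m); the helper name `coulomb_force` is hypothetical.

```python
import math

K_E = 9e9  # Coulomb constant in N*m^2/C^2, rounded as in the worked solution

def coulomb_force(q_a, q_b, r):
    """Signed Coulomb force k_e q_a q_b / r^2; a negative value means attraction.

    In this problem the attraction points toward -x, so the sign of the result
    doubles as the sign of the x direction.
    """
    return K_E * q_a * q_b / r**2

q_pos = 1e-6   # each positive charge, in C
q_neg = -2e-6  # the negative charge, in C

f_1 = coulomb_force(q_pos, q_neg, 1.0)             # from charge 1, purely along x
f_3 = coulomb_force(q_pos, q_neg, math.sqrt(2.0))  # from charge 3, along the diagonal
f_3x = f_3 * math.cos(math.radians(45.0))          # horizontal component only
f_x_total = f_1 + f_3x

print(f"F_x from charge 1: {f_1:.4f} N")
print(f"F_x from charge 3: {f_3x:.4f} N")
print(f"Total F_x:         {f_x_total:.4f} N")
```

The total comes out near −0.0244 N, matching the answer quoted at the start of the problem.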
2. x = 0.77 m. We have two positive charges and one negative charge along a straight line. If we
want there to be no net force on the negative charge, the electric forces from both of the positive
charges on it must cancel. For that to happen, there is only one possibility: the negative charge
has to be between the two positive charges. Outside that middle region, both positive charges
will exert an attractive force on the negative charge in the same direction, and there is no way
they can cancel each other. Only in the middle region do the forces from both positive charges
act in opposite directions on a negative charge, and only there can they cancel each other. We
want to find the position r23 such that both forces are equal in magnitude. All charges are on
the x axis, so the problem is one-dimensional and does not require vectors.
Intuitively, we know that the negative charge q3 must be closer to the smaller of the positive
charges. Since electric forces get larger as separation decreases, the only way the force due to
the larger charge can be the same as that due to the smaller charge is if the negative charge
is farther away from the larger charge.
Let F32 be the force on q3 due to q2 , and F31 be the force on q3 due to q1 , and we will take
the positive x direction to be to the right. Since both forces are attractive, F32 acts in the −x
direction and must therefore be negative, while F31 acts in the +x direction and is positive. This
is only true for the region between the two positive charges! Elsewhere, both positive charges
would pull the negative charge in the same direction, and there is no way they could cancel each other. We are not
told about any other forces acting, so our force balance is this:
It didn’t really matter which one we called negative and which one we called positive, just that
they have different signs. The separation between q2 and q3 is r23 , and the separation between
q1 and q3 is then 2 − r23. Now we just need to write down the electric forces. We will keep everything
perfectly general, and plug in actual numbers at the end ... this is always safer.
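\begin{align*}
F_{32} &= F_{31} \\
\frac{k_e q_3 q_2}{r_{23}^2} &= \frac{k_e q_3 q_1}{(2 - r_{23})^2} \qquad \text{(the common factors } k_e \text{ and } q_3 \text{ cancel)} \\
\frac{q_2}{r_{23}^2} &= \frac{q_1}{(2 - r_{23})^2}
\end{align*}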
Note how this doesn’t depend at all on the actual magnitude or sign of the charge in the middle!
From here, there are two ways to proceed. We could cross-multiply, use the quadratic formula,
and that would be that. On the other hand, since we know that q3 is supposed to be between
the other two charges, then r23 must be positive, and less than 2. That means that we can just
take the square root of both sides of the equation above without problem, since neither side is negative:
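\begin{align*}
\frac{q_2}{r_{23}^2} &= \frac{q_1}{(2 - r_{23})^2} \\
\Longrightarrow \quad \frac{\sqrt{q_2}}{r_{23}} &= \frac{\sqrt{q_1}}{2 - r_{23}} \\
\sqrt{q_2}\,(2 - r_{23}) &= \sqrt{q_1}\, r_{23} \\
2\sqrt{q_2} - \sqrt{q_2}\, r_{23} &= \sqrt{q_1}\, r_{23} \\
2\sqrt{q_2} &= \left(\sqrt{q_2} + \sqrt{q_1}\right) r_{23} \\
r_{23} &= \frac{2\sqrt{q_2}}{\sqrt{q_2} + \sqrt{q_1}}
\end{align*}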
Plugging in the numbers we were given (and noting that all the units cancel):
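\[
r_{23} = \frac{2\sqrt{q_2}}{\sqrt{q_2} + \sqrt{q_1}}
= \frac{2\sqrt{6\ \mu\text{C}}}{\sqrt{6\ \mu\text{C}} + \sqrt{15\ \mu\text{C}}}
= \frac{2\sqrt{6}}{\sqrt{6} + \sqrt{15}}
= \frac{2\sqrt{2}}{\sqrt{2} + \sqrt{5}} \approx 0.77\ \text{m}
\]
For that very last step, we factored out √3 from the top and the bottom. An unnecessary step
if you are using a calculator anyway, but we prefer to stay in practice.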
The more general solution is to go back before we took the square root of both sides of the
equation and solve it completely:
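\begin{align*}
\frac{q_2}{r_{23}^2} &= \frac{q_1}{(2 - r_{23})^2} \\
q_2\,(2 - r_{23})^2 &= q_1 r_{23}^2 \\
q_2 \left(4 - 4 r_{23} + r_{23}^2\right) &= q_1 r_{23}^2 \\
(q_2 - q_1)\, r_{23}^2 - 4 q_2 r_{23} + 4 q_2 &= 0 \\
r_{23} &= \frac{4 q_2 \pm \sqrt{(-4 q_2)^2 - 4\,(q_2 - q_1)\cdot 4 q_2}}{2\,(q_2 - q_1)}\ \text{m} \\
&= \frac{4\cdot 6\ \mu\text{C} \pm \sqrt{(-4\cdot 6\ \mu\text{C})^2 - 4\,(6\ \mu\text{C} - 15\ \mu\text{C})\cdot 4\cdot 6\ \mu\text{C}}}{2\,(6\ \mu\text{C} - 15\ \mu\text{C})}\ \text{m}
\end{align*}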
xvi This would not work if we wanted the point to the left of q2.
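\begin{align*}
r_{23} &= \frac{24 \pm \sqrt{24^2 - 4(-9)(4)(6)}}{2(-9)}\ \text{m} \\
&= \frac{24 \pm \sqrt{24^2 + 36(24)}}{-18}\ \text{m} \\
&= \frac{-24 \mp \sqrt{1440}}{18}\ \text{m} \\
&= (0.775,\ -3.44)\ \text{m}
\end{align*}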
Just as we expected: one solution (r23 = 0.775 m) is right between the two charges, a little bit
closer to the smaller charge. What about the negative solution? This corresponds to a position
far away from both charges, 3.44 m to the left of q2. As stated above, the forces act in the same
direction outside of the middle region, and cannot cancel! This solution is physically impossible,
just an artifact of the mathematics. We specified originally that the equations were only good
for the middle region, so if we get an answer that falls outside we must discard it as outside the
scope of our equations.
Our equations as we have written them do not take into account the fact that the fields change
direction on one side of a charge versus the other. Properly speaking, outside the middle re-
gion between the positive charges, we should write F32 = −F31 since the forces act in the same
direction. Try repeating the problem starting there, and you will find that there are no real
(non-imaginary) solutions outside the middle region - two positive forces cannot add up to zero.
Remember: in the end, we always need to make sure that the solutions are physically sensible
in addition to being mathematically correct.
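The habit of solving first and then filtering for physical sense can be sketched in Python. The snippet below is only an illustration: it assumes q2 sits at the origin and q1 a distance d = 2 m to its right, as in the solution above, and the function name `balance_roots` is our own. It solves (q2 − q1) r² − 2d q2 r + d² q2 = 0 and then keeps only roots in the middle region, where the force balance was actually valid.

```python
import math

def balance_roots(q1, q2, d):
    """Both roots of (q2 - q1) r^2 - 2 d q2 r + d^2 q2 = 0, sorted ascending.

    r is measured from q2 toward q1. Assumes q1 != q2 (otherwise the equation
    is linear and the balance point is simply r = d / 2).
    """
    a = q2 - q1
    b = -2.0 * d * q2
    c = d * d * q2
    root_disc = math.sqrt(b * b - 4.0 * a * c)
    return sorted([(-b + root_disc) / (2.0 * a), (-b - root_disc) / (2.0 * a)])

# Charges in microcoulombs; only their ratio matters, so the unit drops out.
roots = balance_roots(q1=15.0, q2=6.0, d=2.0)
physical = [r for r in roots if 0.0 < r < 2.0]  # keep the middle region only

print("all roots (m):     ", [round(r, 3) for r in roots])
print("physical roots (m):", [round(r, 3) for r in physical])
```

The filter on the last step is the code version of "discard it as outside the scope of our equations."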
Potential energy and the principle of conservation of energy often let us solve difficult
problems without dealing with the forces involved directly. More to the point, using an
energy-based approach to problem solving lets us work with scalars instead of vectors. This way we
get to deal with just plain numbers, which is nice.
In this chapter, we will learn that, as with the gravitational
field, the electric field has an associated potential and potential
energy. The electric potential will, in many cases, let us solve
problems more easily than with the electric field and, as it turns
out, electric potential is what we normally identify with ‘voltage’
in everyday life.
\[ \Delta PE = PE_f - PE_i = -W_F \tag{4.1} \]
where the subscripts f (i) refer to the final (initial) position, and W_F is the work done by
the conservative force $\vec{F}$.
This is just how you dealt with gravity – moving an object of mass m through a vertical
displacement h gives a change in potential energy ∆PE = mgh. Electrical forces and gravitational
forces have a number of useful similarities, as you now know, and the same is true for their respective
potential energies.
Work done moving a charge q in a constant electric field $\vec{E}$ (footnote i):
\[ \Delta W_{AB} = \vec{F}\cdot\Delta\vec{x} = |\vec{F}|\,|\Delta\vec{x}|\cos\theta = qE_x\,(x_f - x_i) = qE_x\,\Delta x \tag{4.2} \]
Note that q, E_x, and ∆x can all be either positive or negative. Also recall that E_x is the
x-component of the electric field $\vec{E}$, not the magnitude! Equation 4.2 is valid for the work done on a
charge by any constant electric field, no matter what the direction of the field, or sign of the charge.
Just remember that the angle between the field and displacement does matter!
[Figure 4.2 labels: High PE (+ side), Low PE (− side), field $\vec{E}$, points A and B at x_i and x_f on the x axis, force $q\vec{E}$ on the charge q, ∆x = x_f − x_i]
Figure 4.2: When a charge q moves in a uniform electric field $\vec{E}$ from point A to point B, covering a distance
∆x, the work done on the charge by the electric force is qE_x ∆x.
Now that we have found the work done by the electric field, the work-energy theorem gives us
the potential energy change:
\[ \Delta PE = -W_{AB} = -q\,|\vec{E}|\,|\Delta\vec{x}|\cos\theta = -qE_x\,\Delta x \tag{4.3} \]
i At this point you may want to remind yourself about the scalar or “dot” product, $\vec{A}\cdot\vec{B} = |\vec{A}||\vec{B}|\cos\theta_{AB}$, where
θ_AB is the angle between $\vec{A}$ and $\vec{B}$.
Remember, just like any other work, the work done involving the electric force only counts the
displacement parallel to the force. You can find the component of the field parallel to the full
displacement, or find the component of the displacement parallel to the field – it is the same thing.
Figure 4.3 compares a charge moving in an electric field to a mass moving in a gravitational field.
A positive charge moving in an electric field acts much like a mass moving in a gravitational field:
the positive charge at point A falls in the direction of the field, just as the mass does. This lowers
its potential energy, and increases its kinetic energy.
Assuming other forces are absent, we can also find the kinetic energy change through conservation
of energy. Since both the electrical and gravitational forces are conservative, we can find the changes
in kinetic and potential energy in both cases and compare them. In both situations, the change
in potential energy must be equal and opposite the change in kinetic energy for energy to be
conservedii :
For the gravitational case, we have done this a million times for an object of mass m starting
at a height d and ending at a height defined as 0:
For the electrical case, it is not much more difficult. We will move a charge q through an electric
field E:
ii The subscripts i and f refer to initial and final, as usual.
In Chapter 3, it was convenient to define the electric field $\vec{E}$, related to the electric force, viz., $\vec{F} = q\vec{E}$. This let
us think about individual charges one at a time, even when our system was a collection of several
charges, and discard the idea of “action at a distance.” For the same reasons, we would like to
define a variation of the electrical potential energy per unit charge, so we may think about how
much potential energy would be gained or lost by a single charge present in an electric field.iii
iii This is similar to the chemical potential in a way, if you are familiar with that.
This quantity is the electric potential difference ∆V , and it is related to potential energy by
∆P E = q∆V .iv
\[ \Delta V = V_B - V_A = \frac{\Delta PE}{q} \qquad \text{or} \qquad q\,\Delta V = \Delta PE \tag{4.12} \]
Electric potential is measured in Joules per Coulomb, otherwise known as Volts. Just like
gravitational potential, electric potential is a scalar quantity. It is essentially a measure of the
change in electric potential energy per unit charge. By definition, it takes 1 J to move 1 C worth
of charge between two points with a potential difference of 1 V. If a 1 C charge moves through a
potential difference of 1 V, it gains 1 J of potential energy.
Consider the special case of a single charge q moving through a region of constant electric field,
such as the area between two parallel charged plates (Fig. 3.9). If the displacement of the charge
∆x is perfectly parallel to the electric field, we can divide Equation 4.3 by q to find the potential
difference ∆V :
Single charge q in a constant electric field $\vec{E}$:
\[ \Delta V = \frac{\Delta PE}{q} = -|\vec{E}|\,|\Delta\vec{x}|\cos\theta = -E_x\,\Delta x \tag{4.13} \]
This lets us see that potential difference also has units of electric field times distance. This makes
sense in a way, since for there to be an electrical potential difference we pretty much have to move
through an electric field. Since electric field has the units of newtons per coulomb (N/C), we can
make the following observation:
A newton (N) per coulomb (C) equals a volt (V) per meter (m): 1 N/C = 1 V/m
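To make Equations 4.3 and 4.13 concrete, here is a small Python sketch. The field strength and displacement below are made-up illustrative numbers, not values from the text, and the function names are our own.

```python
def potential_difference(e_x, delta_x):
    """Delta V = -E_x * Delta x for a uniform field (Eq. 4.13), in volts."""
    return -e_x * delta_x

def potential_energy_change(q, e_x, delta_x):
    """Delta PE = q * Delta V = -q * E_x * Delta x (Eq. 4.3), in joules."""
    return q * potential_difference(e_x, delta_x)

E_X = 500.0          # hypothetical field strength, N/C (equivalently V/m)
DELTA_X = 0.02       # hypothetical displacement along the field, m
Q_PROTON = 1.60e-19  # charge of a proton, C

dv = potential_difference(E_X, DELTA_X)
dpe = potential_energy_change(Q_PROTON, E_X, DELTA_X)

print(f"Delta V  = {dv:.1f} V")
print(f"Delta PE = {dpe:.2e} J")
```

Moving along the field the potential drops (here by 10 V), so a positive charge loses potential energy, while a negative charge making the same move would gain it.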
If we release a positive charge, it spontaneously accelerates from regions of high potential to low
potential – positive charges seek out minima in the electric potential. Conversely, negative charges
seek out maxima in electric potential. Work must be done on positive charges to move them
toward higher potential; work must be done on negative charges to move them to regions of lower
potential.
iv The gravitational potential is the potential energy per unit mass, which is just gh for terrestrial cases, or −Gm/r
for the more general case. We would say that the potential energy difference between two points whose height differs
by h is mgh, while the potential difference is just gh.
\[ V = k_e \frac{q}{r} \tag{4.14} \]
where r is the distance from the point charge q, and k_e is Coulomb’s constant (Eq. 3.2).
This gives us the electric potential – the work per unit charge required to bring a test charge from
an infinite distance away to a point a distance r from the charge q. Figure 4.4 plots for comparison the electric field and electric
potential for a point charge as a function of the distance from the charge.
Keep in mind: you can only measure differences in electric potential. Some reference
point must always be defined as V = 0. For a point charge, this is r = ∞, for a circuit it
is a specific point in the circuit.
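The link between Equation 4.14 and the point-charge field can be checked numerically. The following Python sketch assumes k_e ≈ 8.99 × 10⁹ N·m²/C² and uses a made-up 1 nC charge; it compares the field k_e q/r² with the slope −∆V/∆r of the potential, which is what Equation 4.13 says the field component along r should be over a small step.

```python
K_E = 8.99e9  # Coulomb constant, N*m^2/C^2

def potential(q, r):
    """Point-charge potential V = k_e q / r, with V = 0 at r = infinity (Eq. 4.14)."""
    return K_E * q / r

def field(q, r):
    """Radial electric field of a point charge, E = k_e q / r^2."""
    return K_E * q / r**2

q = 1e-9    # hypothetical 1 nC charge
r = 0.5     # distance from the charge, m
dr = 1e-6   # small step used to estimate the slope of V

# E_r = -Delta V / Delta r, estimated with a symmetric step around r.
e_from_slope = -(potential(q, r + dr) - potential(q, r - dr)) / (2.0 * dr)

print(f"V(r)              = {potential(q, r):.3f} V")
print(f"E from k_e q/r^2  = {field(q, r):.4f} N/C")
print(f"E from -dV/dr     = {e_from_slope:.4f} V/m")
```

The two field values agree, which is also a nice reminder that N/C and V/m are the same unit.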
One quick point, to clear up any later confusion: when dealing with point charges like electrons
in electric fields, or atoms in a crystal (e.g., in nuclear or atomic physics, and sometimes inorganic
chemistry), we often use a more convenient unit of energy, the electron volt.
An Electron Volt [eV] is the kinetic energy an electron gains when accelerated through
a potential difference of 1 V.
1 eV = 1.60 × 10⁻¹⁹ C·V = 1.60 × 10⁻¹⁹ J
We will encounter the electron volt more and more as time goes on; it turns out to be quite
convenient when worrying about small numbers of charges.
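A short Python sketch of the bookkeeping behind the electron volt follows. It assumes no forces other than the electric force act, so that ∆KE = −∆PE = −q∆V; the names are our own, and the 1.5 keV value is just an example.

```python
E_CHARGE = 1.60e-19  # magnitude of the electron charge in C, rounded as in the text

def kinetic_energy_gain(q, delta_v):
    """Delta KE = -Delta PE = -q * Delta V in joules, assuming no other forces act."""
    return -q * delta_v

def joule_to_ev(energy_j):
    """Convert an energy in joules to electron volts."""
    return energy_j / E_CHARGE

# An electron (q = -e) moving to a point 1 V higher in potential.
ke_joule = kinetic_energy_gain(-E_CHARGE, 1.0)
ke_ev = joule_to_ev(ke_joule)

print(f"{ke_joule:.2e} J = {ke_ev:.1f} eV")
print(f"1.5 keV = {1.5e3 * E_CHARGE:.2e} J")
```

Note the signs: the electron gains kinetic energy when it moves toward higher potential, consistent with negative charges seeking out maxima in the potential.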
other. Bringing charges close together means energy is gained or lost to make that happen, and
that energy is the potential energy of the pair of charges – how much energy is tied up in keeping
those two charges where they are. For example, if two positive charges are to be kept close together
against their natural repulsion, energy should be supplied to keep them together. If a positive and
negative charge are to be kept together, energy should be supplied to keep them apart.
Now we see that potential energy really is the energy it takes to configure the system under study.
Figure 4.6 also illustrates the difference between the potential of the separate point charges, and the
potential energy of the pair of point charges. If q1 is already fixed in its position, but q2 is at infinity, the
work that must be done to bring q2 from infinity to its position near q1 is P E = q2 V1 = ke q1 q2 /r12 .
That is what the potential energy is, the energy of this configuration of charges relative to just
having q1 all by itself. If q2 is fixed, it also takes P E = ke q1 q2 /r12 to bring in q1 . Thus, it takes
P E = ke q1 q2 /r12 to build our system of two charges, no matter how we do it:
\[ PE_{\text{two charges}} = PE_{(1\text{ due to }2)} = PE_{(2\text{ due to }1)} = q_2 V_1 = q_1 V_2 = \frac{k_e q_2 q_1}{r_{12}} \tag{4.15} \]
As mentioned above, if the charges are of the same sign, P E is positive, and work must be done
by an external force to bring the charges together. If they are of opposite sign, P E is negative,
and negative work must be done to keep the charges from accelerating toward each other as they
are brought together. In other words, work must be done to keep the charges apart. Another way
to view the potential energy of the pair of charges is to think about how much kinetic energy would
be gained if we let one of them loose again. If we have a pair of charges with an electrical potential
energy of, say, 1 J with both charges fixed, the charges can gain between them 1 J of kinetic energy
after being let loose. If one stays fixed, the other gets a full 1 J. If both charges are identical and
both move, they each get 0.5 J.
Figure 4.6: (a) If the charge q1 is removed, a potential k_e q2/r12 exists at point P due to charge q2. (b) Similarly, the
charge q1 gives a potential k_e q1/r12 at point P′. (c) Either way we build our system of charges, the potential energy of the
system of two charges is just q2 V1 = q1 V2, or k_e q1 q2/r12.
What if we have several charges? Just to be concrete, take the system of three point charges
in Figure 4.7. We can obtain the total potential energy of this system by calculating the P E for
every pair combination of charges and adding the results together. Since potential and potential
energy are scalars, we don’t need to worry about components – this is just an algebraic sum:
\[ PE = PE_{1\&2} + PE_{2\&3} + PE_{1\&3} = PE_{2\&1} + PE_{3\&2} + PE_{3\&1} = k_e \left( \frac{q_1 q_2}{r_{12}} + \frac{q_1 q_3}{r_{13}} + \frac{q_2 q_3}{r_{23}} \right) \tag{4.16} \]
Note that it doesn’t matter what order we sum them in, or if we transpose the labels –
P E1&2 is the same thing as P E2&1 , and r13 is the same as r31 , just like the example with two
charges above.v
What does this really mean, physically? It is the same whether we have two charges or three
or a million. What we are really summing up is the energy required to build this particular
configuration of charges. Imagine that q1 is fixed at the position shown in Figure 4.7, but that
q2 and q3 are at infinity. The work that must be done to bring q2 from infinity to its position
near q1 is P E1&2 = ke q1 q2 /r12 , which is the first term in Equation 4.16. The last two terms
represent the work required to bring q3 from infinity to its position near q1 and q2 , which involves
the interaction with q1 (the second term in Equation 4.16) and the interaction with q2 (the third
term in Equation 4.16). Compare this with Equation 4.15. Again, the result is independent of the
order in which the charges are moved in from infinity.
We can write this more succinctly as a sum over all the charges:
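\begin{align*}
PE &= \frac{1}{2} \sum_{i=1}^{3} \sum_{\substack{j=1 \\ j \neq i}}^{3} \frac{k_e q_i q_j}{r_{ij}} \tag{4.17} \\
&= \frac{1}{2} \left[ \frac{k_e q_2 q_1}{r_{21}} + \frac{k_e q_3 q_1}{r_{31}} + \frac{k_e q_1 q_2}{r_{12}} + \frac{k_e q_3 q_2}{r_{32}} + \frac{k_e q_1 q_3}{r_{13}} + \frac{k_e q_2 q_3}{r_{23}} \right] \tag{4.18} \\
&= k_e \left( \frac{q_1 q_2}{r_{12}} + \frac{q_1 q_3}{r_{13}} + \frac{q_2 q_3}{r_{23}} \right) \tag{4.19}
\end{align*}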
v If you are into the math, that means we sum over all possible combinations, nCk, not permutations, nPk, so we
do not count any pair more than once.
Here we color-coded the like terms for clarity. Basically, first we pick some charge j, and sum over
all its pairings with the other charges i, making sure not to pair the charge with itself. Here we
have the factor 1/2 because the sum as written would count every pair of charges twice – since the
pair 1&3 is the same as the pair 3&1. Think about that for a second, and reassure yourself that
the factor 1/2 is necessary. (If you are not familiar with summations, don’t worry. We will only ever
deal with a few charges at once.) For any arbitrary number of charges N, we can just change the
limits on the sum:
\[ PE_{\text{total}} = \frac{1}{2} \sum_{j}^{N} \sum_{i \neq j}^{N} \frac{k_e q_i q_j}{r_{ij}} \tag{4.20} \]
The double-sum notation above means “take the charge j = 1, and sum over all the other charges
i = 2, 3, 4, ..., N, then take the charge j = 2, and sum over the other charges i = 1, 3, 4, ..., N, and so
on, until j = N.” Again, this counts every pair twice, hence the factor 1/2.
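The two ways of counting can be compared directly in code. The Python sketch below is an illustration only: the three charges and their positions are made up, k_e is taken as 8.99 × 10⁹ N·m²/C², and the function names are our own. One function sums over each pair once (combinations), the other implements the double sum of Equation 4.20 with its factor of 1/2.

```python
import itertools
import math

K_E = 8.99e9  # Coulomb constant, N*m^2/C^2

def pe_pair_sum(charges, positions):
    """Sum k_e q_i q_j / r_ij over each distinct pair exactly once."""
    sites = list(zip(charges, positions))
    return sum(
        K_E * qa * qb / math.dist(ra, rb)
        for (qa, ra), (qb, rb) in itertools.combinations(sites, 2)
    )

def pe_double_sum(charges, positions):
    """Equation 4.20: full double sum with i != j, halved to undo double counting."""
    total = 0.0
    for j in range(len(charges)):
        for i in range(len(charges)):
            if i != j:
                total += K_E * charges[i] * charges[j] / math.dist(positions[i], positions[j])
    return 0.5 * total

# Hypothetical three-charge system (coulombs, meters).
charges = [1e-6, -2e-6, 3e-6]
positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

print(f"pair sum:   {pe_pair_sum(charges, positions):.6f} J")
print(f"double sum: {pe_double_sum(charges, positions):.6f} J")
```

Both functions give the same energy; the pair sum simply does half the work, which is the point of footnote v.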
b is +e, Fig. 4.8. We can readily sum over all the possible pair interactions in the crystal, after a bit of geometry to
figure out the distances between pairs.
Figure 4.8: (a) A crystal consisting of a cube of −e negative charges, with a single +e charge at the
center of the cube. The potential energy of the arrangement of nine charges is a sum over potential
energy of all pairs. (b) There are four types of pairs involved in the sum.
For this crystal, we have 12 pairs of negative charges that are just one edge of the cube apart, twelve
pairings between negative charges sideways across the cube faces, eight pairings between the negative
corner charges and the central positive charge, and four corner to corner pairings of negative charges.
This is illustrated in Fig. 4.8. Standard geometry tells us that the distance between edge charges is just b,
the distance from corner to center is (√3/2) b, the corner-corner distance across a cube face is √2 b, and
finally the distance between opposite corner charges is √3 b. The sum over all pairs is then:
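\[ PE_{\text{crystal}} = 8 \times \frac{k_e\,(-e \cdot e)}{(\sqrt{3}/2)\, b} + 12 \times \frac{k_e e^2}{b} + 12 \times \frac{k_e e^2}{\sqrt{2}\, b} + 4 \times \frac{k_e e^2}{\sqrt{3}\, b} \approx \frac{13.55\, k_e e^2}{b} \tag{4.21} \]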
Figure 4.8 shows where each term in the sum comes from. Though this seems a bit complicated,
think about how hard it would be to compute the forces for every pair of charges and find the
resultant vector force! We would have to do that for every stage of construction of the crystal, a
tedious task at best. The relatively simple potential energy calculation above is a powerful way to
address the amount of energy tied up in maintaining a particular charge distribution.
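The pair counting in Equation 4.21 is easy to verify by brute force. The Python sketch below is only a check under stated assumptions: it works in units where k_e e²/b = 1, places the eight −e charges on the corners of a unit cube and the +e charge at the center, and lets the computer enumerate the pairs. The names are our own.

```python
import itertools
import math

def cube_crystal_sites():
    """Eight -1 charges on the corners of a unit cube plus a +1 charge at its center."""
    corners = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
    return [(-1.0, c) for c in corners] + [(+1.0, (0.5, 0.5, 0.5))]

def crystal_energy(sites):
    """Sum q_a q_b / r_ab over every distinct pair, in units of k_e e^2 / b."""
    return sum(
        qa * qb / math.dist(ra, rb)
        for (qa, ra), (qb, rb) in itertools.combinations(sites, 2)
    )

sites = cube_crystal_sites()
n_pairs = math.comb(len(sites), 2)
energy = crystal_energy(sites)

print(f"number of pairs: {n_pairs}")
print(f"PE = {energy:.3f} k_e e^2 / b")
```

The 36 pairs are exactly the 8 + 12 + 12 + 4 listed above, and the sum comes out at about 13.56 in these units, consistent with the ≈ 13.55 quoted in Equation 4.21 to within rounding.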
In this case, note that the total energy of this crystal lattice is positive, representing the fact
that work had to be done on the crystal to assemble it in the first place. Left to its own devices,
the charges in the crystal would want to disassemble. If we did let these charges move apart again,
they would recover the potential energy as kinetic energy and speed away. This makes sense – it is
silly to expect that real crystals are made of mostly negative charges, when we know that they are
neutral overall. In reality, crystals are made of an equal number of positive and negative charges,
which in many cases leads to a negative potential energy, indicating that the charges actually lower
their energy by assembling into a crystal, and therefore favor doing so.
It is also curious that the potential energy sum for our cubic crystal ends up being a constant
factor (about 13.55 times) what it would be for just a single pair of point charges separated by
a distance b. In general, this is true for nearly any crystal lattice we can construct – the energy
will always be some multiple of what it is for a single pair of charges. The multiple itself – in this case
13.55 – divided by the total number of charges is known as Madelung’s constant, and every sort of
crystal lattice has its own particular Madelung constant. The Madelung constant only depends on
the geometric arrangement of the constituent ions in the crystal structure. Basically, the Madelung
constant is something you look up in a table that takes care of all the nasty summing for you –
someone has already done it! In general we can write the potential energy of a crystal like this:
\[ PE_{\text{crystal}} = \frac{1}{2} M N \frac{k_e z^2 e^2}{r} \tag{4.22} \]
Here M is the Madelung constant, N is the number of charges we are considering, z is the charge of
the ions in the lattice (±1 in this case), and r is their separation. By inspection, you can see that
for our cubic crystal, 13.55 = (1/2) M N. Since there are N = 9 charges in our example, our Madelung
constant is 2(13.55)/9 ≈ 3.01.
If we take the structure of NaCl (common salt or rocksalt), the so-called face-centered cubic
structure shown in Fig. 4.9a, the Madelung number ends up being about −1.75 if you carefully
take the limit of the sum for very large N . The rocksalt structure has alternating positive Na+ and
Cl− ions, arranged in a face-centered cubic structure. Overall, it is electrically neutral, and the
negative potential energy reflects the stability of the structure. The negative sign shows that
work would have to be done to take the NaCl crystal apart – it is intrinsically stable. This is in
contrast to our fictitious body-centered cubic case above. Since our cubic crystal is mostly made
of negative charges, it is not stable, and work has to be done to assemble it. The NaCl structure,
however, has an equal number of positive and negative charges, and the negative potential energy
sum explains the cohesion of the crystal and the fact that NaCl spontaneously assembles when Na
and Cl are mixed. The Na and Cl constituents can lower their overall energy by assembling into
a crystal, and that is what they do when given half a chance. The more negative the Madelung
constant, the more stable the crystal is, if everything else is the same.
As another example, consider the Rutile (TiO2) structure in Fig. 4.9b. In this case, the Madelung
number is −4.82, suggesting that rutile structure materials should be quite stable, and they generally
are. There is one problem with all of this, however. Based on the analysis above, shrinking the distance
b between charges in the crystal should make the potential energy even more negative. In other words,
the smaller the spacing, the more stable the crystal would be. If that were true, why would the crystal
not just keep shrinking until it collapsed? In fact, it can be shown that no system of stationary
charges can be in a stable equilibrium according to classical physics. We need quantum physics to
explain why, e.g., salt crystals do not spontaneously shrink, and how crystals are stable in the first
place.
Figure 4.9: (a) The NaCl or rocksalt structure. There are an equal number of Na+ and Cl− ions, the crystal is
neutral overall. (b) The Rutile (TiO2) structure. There are twice as many O2− ions as Ti4+ to maintain neutrality.
So the work done on a charge by an electric force is related to the change in electric potential
energy of the charge. We also know that the change in electric potential energy between points
A and B must be related to the potential difference between those two points. Putting these two
facts together, we can easily relate work and potential difference:
−W = ∆P E = q (VB − VA ) (4.23)
In Chapter 3, we said that for a conductor in electrostatic equilibrium, net charge resides only
on the conductor’s surface. Moreover, we said that the electric field just outside the surface of the
conductor is perpendicular to the surface, and that the field inside the conductor is zero. This also
means that all points on the surface of a charged conductor in electrostatic equilibrium
are at the same potential.
[Figure 4.10 labels: points A and B on the surface of the conductor, + signs marking the surface charge]
Figure 4.10: An arbitrarily shaped conductor carrying a positive charge. When the conductor is in electrostatic
equilibrium, all of the charge resides at the surface, $\vec{E} = 0$ inside the conductor, and the direction of
$\vec{E}$ just outside the conductor is perpendicular to the surface. The electric potential is constant inside the
conductor and is equal to the potential at the surface. Note from the spacing of the positive signs that the
surface charge density is nonuniform.
Equation 4.23 gives us a very general result: no net work is required to move a
charge between two points which are at the same electric potential. Mathematically,
W = 0 whenever VB = VA.
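Equation 4.23 is simple enough to sketch in a couple of lines of Python. The charge and potentials below are made-up illustrative values and the function name is our own; the sketch assumes only the conservative electric force is doing the work.

```python
def work_by_field(q, v_a, v_b):
    """Work done by the electric force on q moving from A to B: W = -q (V_B - V_A)."""
    return -q * (v_b - v_a)

Q = 2e-6  # hypothetical 2 uC charge

w_equipotential = work_by_field(Q, v_a=12.0, v_b=12.0)  # A and B at the same potential
w_downhill = work_by_field(Q, v_a=12.0, v_b=3.0)        # positive charge moving to lower V

print(f"same potential:  W = {w_equipotential:.1e} J")
print(f"12 V down to 3 V: W = {w_downhill:.1e} J")
```

The first case gives zero work no matter what path is taken between A and B, since only the two endpoint potentials enter.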
Consider the path connecting points A and B along the surface of the conductor in Figure 4.10.
If we move only along the conductor’s surface, the electric field $\vec{E}$ is always perpendicular to our
path. Since the electric field and displacement are always perpendicular, no work is done when
moving along the surface of a conductor. Equation 4.23 then tells us that if the work is zero, points
A and B must be at the same potential, VB − VA = 0. Since the path we have chosen is completely
arbitrary, this means it is true for any two points on the surface.
Of course, this only holds for perfect conductors. If other dissipative (or non-conservative) forces
are present, this is not true, and work is required to move the charge in the presence of a dissipative
force. The electrical analog of friction or viscosity is resistance, which will be treated in the next
chapter.
Figure 4.11: The blue lines are electric field lines, and the red lines are equipotential surfaces for (a) a single point charge q,
(b) an electric dipole (q and −q), and (c) two like charges (q and q). In each case, the equipotential surfaces are perpendicular to the electric field
lines at every point. (Again, arrows are left off of the field lines for simplicity. Equipotential lines do not need arrows, since
potential is a scalar.)
Figure 4.12: The blue lines are electric field lines, and the red lines are equipotential surfaces for (left) a conducting sphere
near a point charge q, where E = 0 inside the conductor, and (right) a point charge suspended above a long grounded
conducting plate.
More examples are given in Fig. 4.12, which include conductors. For a conductor, we know the
electric field inside is zero, and the electric potential is constant. Add to this the fact that electric
field lines and equipotential lines are always perpendicular where they meet, and you should be
able to explain all of the examples shown here. This is why in the right-hand example, a single charge
above a ground plane, the electric field lines all intersect the ground plane at perfect right angles,
and in the left-hand example, there are no lines inside the conducting sphere. Compare these figures
with Fig. 3.9 – the relationship between electric field lines and equipotential lines should be clear.
Appendix B might give you a bit more insight as to why the electric field lines and equipotential
lines behave the way they do. Recall from Sect. 3.5 that conductors are mirrors for electric field
lines; the same is true for the equipotential lines.
How do we actually change the potential or voltage of one object relative to another? Charging by
induction or conduction are two ways, but somewhat cumbersome. A device known as a voltage
source is a circuit element with two terminals, where a constant voltage difference is supplied
between these two terminals. Whatever you connect to the “negative” terminal of the voltage
source will have a voltage ∆V lower than the “positive” terminal. Using a “ground” point (recall
Sect. 3.2.1.1), one can also experimentally define one of the terminals as V = 0. If we “ground” the
negative terminal, then the negative terminal is Vneg = 0, and the positive terminal has Vpos = ∆V .
We will see much more of this in the coming chapters, and it will begin to make more sense!
Batteries are one example of a constant voltage source, which we will cover in more detail in
Chapter 6, and the wall outlets in your house are another example of a voltage source (though this
voltage is not strictly constant, see Chapter 9). Ideal textbook voltage sources always supply a
constant potential difference, ∆V . Real voltage sources always have restrictions, a primary one being
the amount of power that can be sourced. Below are circuit diagram symbols for constant voltage
sources: the first two represent batteries, and the last is a generic symbol for any more complicated
sort of voltage source:
[Circuit symbols: two battery symbols and one general constant voltage source, each with its + and – terminals marked.]
Now that we know a bit about voltage and conductors, we are moving closer to being able
to describe simple electric circuits. Presently, we will introduce our first real circuit element, the
capacitor.
4.6 Capacitance
A capacitor is an electronic component used to store electric charge, and it is used in essentially
any electric circuit you can name. Capacitors are at the heart of both Random Access Memory (RAM)
and flash memory, besides being crucial for nearly any sort of power supply. It is one of the
fundamental building blocks for electronics, and the first we will meet. Figure 4.13 shows a typical
design for a capacitor – two metal plates with some special stuff in between. It is hard to believe
complicated devices like computers rely on such a simple construction, but it is true!
Figure 4.13: A parallel-plate capacitor consists of two conducting plates of area A, separated by a
distance d. The capacitance of this structure is $C = \epsilon_0 A/d$.
A typical capacitor consists of two parallel metal plates, separated by a distance d. When used in a
circuit, the plates are connected to the positive and negative terminals of a voltage source such as a
battery. An ideal voltage source insists that the two plates have a voltage difference of ∆V , and this
has the effect of pulling electrons off of one plate, leaving it with a net positive charge +Q, and
transferring these electrons to the second plate, leaving it with a net negative charge −Q. The charges
on the two plates are equal in magnitude, but opposite in sign. Essentially, putting the two plates at
different potentials means the source pulls electrons off the plate at higher potential, leaving it
deficient in electrons, and piles them up on the plate at lower potential.
The transfer of charge between the plates stops when the potential difference across the plates
is the same as the potential difference of the voltage source. The capacitor stores this potential
difference, and hence stores electrical energy, until some later time when it can be reclaimed for a
specific application. You can think of this as energy storage from one point of view, or a time-delayed
response from another.
Keep in mind (again): you can only measure differences in electric potential. Some reference
point must always be defined as V = 0. In the case of the capacitor connected only to a battery
(without any ground points), the potential is zero half way between the two plates.vi
Definition of Capacitance:
The capacitance C is the ratio of the charge stored on one conductor (or the other) to
the potential difference between the conductors:
$$C \equiv \frac{Q}{\Delta V} \tag{4.24}$$
C is always positive, and has units of farads [F], or coulombs per volt [C/V].
The capacitance of a particular arrangement of two conductors depends on their geometry and
relative arrangement. One common (and simple) structure is the parallel plate capacitor, as
shown in Figure 4.13. In Chapter 3, we stated without proof (but not without good reason) that
the electric field between two parallel plates is constant. But what is the field in between the plates?
First, we assume that the two plates are identical, such that they carry the same magnitude of charge
– one has +Q and one has −Q. Second, we assume the plates’ area A is large compared to their
spacing d, such that we can ignore the edge regions where the field “fringes” (see Fig. 3.9 and 4.14).
Finally, we will connect the plates to a battery with total voltage ∆V .
In Sect. 3.8.4, we found that the electric field above a flat conducting plate is given by
$E = \sigma_E/\epsilon_0$, where $\sigma_E$ is the charge per unit area on the plate. Since the total charge on
each plate is just Q, the charge per unit area is $\sigma_E = Q/A$, and $Q = \sigma_E A$. This leads us to a
more useful expression for the field: $E = Q/(A\epsilon_0)$. Again, this is not valid near the edges of the
plates where the field is not really constant.
vi The potential is also zero infinitely far away, of course, but this is hardly useful or reassuring when wiring a circuit.
Now where the field is constant, we know that the potential difference between the two plates
is ∆V = Ed, where d is the distance between the two plates. Combining this with the facts above,
we can find the capacitance of the parallel plate capacitor from Equation 4.24:
$$C = \frac{Q}{\Delta V} = \frac{\sigma_E A}{E d} = \frac{\sigma_E A}{(\sigma_E/\epsilon_0)\, d} = \frac{\epsilon_0 A}{d} \tag{4.25}$$
$$C = \frac{\epsilon_0 A}{d} \tag{4.26}$$
where d is the spacing between the plates, and A is the area of the plates.
We can see from Equation 4.26 that capacitors can store more charge when the plates become
larger. The same is true when the plates get close together. When the plates are closer together,
the opposing charges exert a stronger force on each other, allowing more charge to be stored on the
plates. From Equation 4.24, a capacitor of value C at a potential difference of ∆V stores
a charge Q = C∆V .
Figure 4.14 shows more realistic field lines for a parallel plate capacitor. In between the two
plates, the field is very nearly constant, but much less so near the edges of the plates. So long as
the plates are relatively large compared to their separation, we can for practical purposes ignore
this complication, and our capacitance calculated from Eq. 4.26 will be very accurate.
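As a rough numerical illustration of Equations 4.24 and 4.26, the short Python sketch below computes the capacitance of an ideal parallel-plate capacitor and the charge it stores. The plate size, spacing, and battery voltage are made-up example values, the function names are only illustrative, and fringing at the plate edges is ignored.

```python
EPSILON_0 = 8.854e-12  # permittivity of free space, in F/m


def parallel_plate_capacitance(area, separation):
    """Equation 4.26: C = epsilon_0 * A / d (SI units, edge fringing ignored)."""
    return EPSILON_0 * area / separation


def stored_charge(capacitance, delta_v):
    """Equation 4.24 rearranged: Q = C * delta_V."""
    return capacitance * delta_v


# Assumed example: 10 cm x 10 cm plates, 1 mm apart, connected to a 9 V battery.
area = 0.10 * 0.10   # plate area in m^2
gap = 1.0e-3         # plate separation in m
c_plates = parallel_plate_capacitance(area, gap)
q_plates = stored_charge(c_plates, 9.0)
print(f"C = {c_plates:.3e} F, Q = {q_plates:.3e} C")

# Halving the gap doubles the capacitance, as Equation 4.26 predicts.
c_closer = parallel_plate_capacitance(area, gap / 2)
print(f"C at half the gap = {c_closer:.3e} F")
```

With these assumed dimensions the capacitance comes out a little under 90 pF, which gives a feeling for how small the capacitance of bare, everyday-sized plates is.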
Capacitors form the basis for several types of Random Access Memory (RAM) in modern com-
puters. Dynamic random access memory (DRAM) is one type of random access memory that
stores each bit of data in a separate capacitor. One capacitor in a DRAM structure holds one bit
of information (a “1” or a “0”). When the capacitor has charge stored in it, the bit is a “1,” and
when there is no charge stored the bit is a “0.” Flash memory works in a roughly similar manner.
the work required to move the charge onto the plates. If a capacitor is initially uncharged (both
plates neutral), very little work is required to move a charge ∆Q from one plate to another across
the separation d. As soon as this charge is moved, however, a potential difference ∆V = ∆Q/C
appears between the plates. This potential difference means that work must be done to move
additional charges onto the plates. Combining what we know so far, and assuming a constant electric
field between the plates, the work that needs to be done to move the first bit of charge ∆Q has to be:
Figure 4.15: Each bit of charge $\Delta Q_i$ transferred through a voltage $\Delta V_i$ contributes a bit of
potential energy $PE_i = \Delta V_i \Delta Q_i$. Summing all those contributions to get the total energy stored is
the same as finding the total area of the shaded region. If we make $\Delta V_i$ and $\Delta Q_i$ tiny enough, the
area is basically a triangle, and in total $PE = \tfrac{1}{2} Q \Delta V$.
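\begin{align}
\Delta PE &= -\Delta W \tag{4.27}\\
&= \Delta Q \cdot E\,\Delta x \tag{4.28}\\
&= \Delta Q \cdot E\, d \tag{4.29}\\
&= \frac{1}{\epsilon_0}\,\Delta Q\,\sigma_E\, d \tag{4.30}\\
\Delta PE &= \frac{d}{A\epsilon_0}\,\Delta Q\,\Delta Q \tag{4.31}\\
\Delta PE &= \frac{(\Delta Q)(\Delta Q)}{C} \tag{4.32}
\end{align}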
vii I once burned a small hole in my thumb by accidentally discharging a high-voltage capacitor across it while
repairing a TV, for example. Capacitors can store dangerous amounts of energy if released at the wrong time!
If we keep doing this with more and more ∆Qs, until we build up the total charge Q, we can find
the total work. As illustrated in Fig. 4.15, each little bit of charge $\Delta Q_i$ adds a bit of potential
energy $\Delta V_i \Delta Q_i$. If we sum up all those contributions, we are really just finding the shaded area
of the triangle on the graph. The area of a triangle is just $\tfrac{1}{2}$(base)(height), so the total change in
potential energy is just:viii
$$|W| = |\Delta PE| = \frac{1}{2} Q \Delta V \tag{4.33}$$
Remember that Q = C∆V must still be true, so we can write the energy stored in the capacitor in
three different ways, as shown below (noting that energy stored = work done). For example, you
can verify that a 5 µF capacitor charged with a 120 V source stores 36 mJ ($3.6\times10^{-2}$ J).
$$\text{Energy stored} = \frac{1}{2} Q \Delta V = \frac{1}{2} C (\Delta V)^2 = \frac{Q^2}{2C} \tag{4.34}$$
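The three forms of Equation 4.34, and the "add up many small steps" picture of Figure 4.15, can be checked numerically. The Python sketch below is only an illustration: the function names are invented here, and the step-by-step sum assumes each small charge ∆Q is moved through the voltage already present on the plates.

```python
def energy_from_charge_and_voltage(q, delta_v):
    """First form of Equation 4.34: (1/2) Q dV."""
    return 0.5 * q * delta_v


def energy_from_capacitance_and_voltage(c, delta_v):
    """Second form of Equation 4.34: (1/2) C dV^2."""
    return 0.5 * c * delta_v ** 2


def energy_from_charge_and_capacitance(q, c):
    """Third form of Equation 4.34: Q^2 / (2C)."""
    return q ** 2 / (2.0 * c)


def energy_by_small_steps(c, q_total, steps):
    """Sum dV_i * dQ_i over many small charge transfers, as in Figure 4.15."""
    dq = q_total / steps
    total = 0.0
    for i in range(steps):
        v_before = (i * dq) / c  # voltage already on the plates before this step
        total += v_before * dq
    return total


# The example from the text: a 5 uF capacitor charged by a 120 V source.
c = 5.0e-6
delta_v = 120.0
q = c * delta_v

e1 = energy_from_charge_and_voltage(q, delta_v)
e2 = energy_from_capacitance_and_voltage(c, delta_v)
e3 = energy_from_charge_and_capacitance(q, c)
e_steps = energy_by_small_steps(c, q, 100_000)
print(e1, e2, e3, e_steps)  # all close to 0.036 J
```

The staircase sum approaches the triangle area from below as the number of steps grows, which is the content of the limiting argument in Figure 4.15.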
Is there an analogy for electrical energy storage? One way to store gravitational energy is simply
to pump a large mass m of water up to a height ∆y, see Figure 4.16. Releasing the water at a later
time releases the stored potential energy mg∆y, which could be used to, e.g., rotate a turbine. In
fact, this is one way to store excess energy generated at off-peak times in power plants for later
reclamation.
devices. As we will find out later, they can also be used to filter out high- and low-frequency signals
selectively. The circuit diagram symbol for a capacitor is a reminder of the parallel plate geometry:
Capacitors are manufactured with standard values, and by combining them in different ways, almost any
non-standard value of capacitance can be realized. Figure 4.19 shows a parallel arrangement of
capacitors. The left plate of each capacitor is connected by a wire (black lines) to the positive
terminal of a battery, while the right plate of each capacitor is connected to
the negative terminal of the battery.ix This means that the capacitors in parallel both have
the same potential difference ∆V across them, the voltage supplied by the battery.
Figure 4.18: A picture of several common types of capacitors.
When the capacitors are first connected, electrons leave the positive plates and go to the negative
plates until equilibrium is reached - when the voltage on the capacitors is equal to the voltage of
the battery. The internal (chemical) energy of the battery is the source of energy for this transfer.
In this configuration, both capacitors charge independently, and the total charge stored is the sum
of the charge stored in C1 and the charge stored in C2 . We can write the charge on the capacitors
using Equation 4.24:
$$Q_1 = C_1 \Delta V$$
$$Q_2 = C_2 \Delta V$$
$$Q_{\text{total}} = Q_1 + Q_2 = C_1 \Delta V + C_2 \Delta V = (C_1 + C_2)\,\Delta V$$
What this equation shows is that two capacitors in parallel behave as one single capacitor
with a value of $C_1 + C_2$. In other words, “capacitors add to each other in parallel.” We call
$C_1 + C_2$ the “equivalent capacitance” $C_{eq}$. For two capacitors, and for any number of capacitors
in parallel, respectively:
$$C_{eq} = C_1 + C_2 \tag{4.35}$$
$$C_{eq} = C_1 + C_2 + C_3 + \ldots \tag{4.36}$$
The key point for capacitors in parallel is that the voltage on each capacitor is the same. One
way to see this is that they are both connected to the battery by the same perfect wires, so they
pretty much have to have the same voltage. This is true in general, as we will find out, so long as
we have perfect textbook wires. It follows readily that the equivalent capacitance of a parallel
combination is always more than either of the individual capacitors.
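The parallel rule is easy to check numerically. In the Python sketch below, the two capacitances and the battery voltage are assumed example values (they do not come from the text), and `parallel_equivalent` is just an illustrative name for Equation 4.36.

```python
def parallel_equivalent(*capacitances):
    """Equation 4.36: capacitances in parallel simply add."""
    return sum(capacitances)


# Assumed example values: 2 uF and 4 uF across a 12 V battery.
c1, c2 = 2.0e-6, 4.0e-6
delta_v = 12.0

q1 = c1 * delta_v            # charge on C1 (Equation 4.24)
q2 = c2 * delta_v            # charge on C2
c_eq = parallel_equivalent(c1, c2)
q_total = c_eq * delta_v     # charge on the single equivalent capacitor

print(f"C_eq = {c_eq:.1e} F, Q1 + Q2 = {q1 + q2:.2e} C, Q_total = {q_total:.2e} C")
```

Both capacitors see the same voltage, and the equivalent capacitor stores the same total charge as the pair, which is what "equivalent" means here.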
Figure 4.20a shows the second simple combination, two capacitors connected in series. For series
capacitors, the magnitude of charge is the same on all plates. Consider the left-most
plate of C1 and right-most plate of C2 in Figure 4.20. Since they are connected directly to the
battery, they must have the same magnitude of charge, +Q and −Q respectively.
Since the middle two plates (the right plate of C1 and the left plate of C2 ) are not connected
to the battery at all, together they must have no net charge. On the other hand, the left and right
plates of the same capacitor have to have the same magnitude of charge, so this means all plates
have a charge of either +Q or −Q stored on them. All of the right plates have charge −Q,
and all the left plates have a charge +Q.
ix In circuit diagrams like these, the wires are assumed to be perfect.
Can we reduce this series combination to a single equivalent capacitor, like we did for the parallel
case? Sure, with a little math. A single capacitor equivalent to the series capacitors, Figure 4.20b,
must have a charge of +Q on its left plate, and −Q on its right plate, so the total charge stored is
still ±Q on each plate. Further, it must have a potential difference equal to that of the battery,
∆V . Using Equation 4.24:
$$\Delta V = \frac{Q}{C_{eq}} \tag{4.37}$$
$$\Delta V_1 = \frac{Q}{C_1} \qquad \Delta V_2 = \frac{Q}{C_2} \tag{4.38}$$
Conservation of energy requires that all of the potential difference of the battery ∆V be
“used up” somewhere. Since our wires are assumed to be perfect, the only place the potential
can go is onto the capacitors. Therefore, for the series case the voltage on C1 and C2 must together
total that of the battery:
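$$\Delta V = \frac{Q}{C_{eq}} = \frac{Q}{C_1} + \frac{Q}{C_2} \tag{4.40}$$
Canceling the Q’s, we can come up with the equivalent capacitance for series capacitors:
$$\frac{1}{C_{eq}} = \frac{1}{C_1} + \frac{1}{C_2} \tag{4.41}$$
$$\frac{1}{C_{eq}} = \frac{1}{C_1} + \frac{1}{C_2} + \frac{1}{C_3} + \ldots \tag{4.42}$$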
It follows that the equivalent capacitance of a series combination is always less than
either of the individual capacitors. The key point for capacitors in series is that the charge
on each capacitor is the same, and the same as the charge on the equivalent capacitor.
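A short numerical check of the series rule is sketched below in Python. As before, the capacitor values and battery voltage are assumed for illustration only, and `series_equivalent` is simply a name chosen here for Equation 4.42.

```python
def series_equivalent(*capacitances):
    """Equation 4.42: the reciprocals of series capacitances add."""
    return 1.0 / sum(1.0 / c for c in capacitances)


# Assumed example values: 2 uF and 4 uF in series across a 12 V battery.
c1, c2 = 2.0e-6, 4.0e-6
delta_v = 12.0

c_eq = series_equivalent(c1, c2)
q = c_eq * delta_v   # the same charge magnitude sits on every plate
v1 = q / c1          # voltage across C1 (Equation 4.38)
v2 = q / c2          # voltage across C2

print(f"C_eq = {c_eq:.3e} F, Q = {q:.2e} C, V1 = {v1:.2f} V, V2 = {v2:.2f} V")
```

The two voltages add up to the battery voltage, and the smaller capacitor takes the larger share of it, since both hold the same charge.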
The easiest way to see how one can use the rules for series and parallel capacitors to reduce any
complex combination of capacitors to a single equivalent capacitor is by example. Consider the
combination of capacitors in Figure 4.21 below.
Figure 4.21: (a) Reducing the complex combination to a single equivalent capacitor. (b) Working backwards to find the
charge on each capacitor.
Finding the equivalent capacitor. First, we notice from Figure 4.21a that the only purely
series or parallel combination to start with is the 20 µF and 3 µF capacitors in series. We can
combine those into an equivalent capacitance, C2 , using Equation 4.41:
$$\frac{1}{C_2} = \frac{1}{20\,\mu\mathrm{F}} + \frac{1}{3\,\mu\mathrm{F}} \tag{4.43}$$
$$C_2 = \frac{1}{\frac{1}{20\,\mu\mathrm{F}} + \frac{1}{3\,\mu\mathrm{F}}} = \frac{3 \cdot 20}{3 + 20}\,\mu\mathrm{F} \tag{4.44}$$
$$C_2 = 2.6\,\mu\mathrm{F} \tag{4.45}$$
Redraw the circuit to reflect this change, and we arrive at the second diagram in Figure 4.21a. Now
we have the equivalent capacitor C2 purely in parallel with the 6 µF capacitor. Using Equation 4.35,
we can combine those two into another equivalent capacitance C3 :
$$C_3 = C_2 + 6\,\mu\mathrm{F} = 8.6\,\mu\mathrm{F} \tag{4.46}$$
Redraw the circuit, and we arrive at the third diagram in Figure 4.21a. Now we only have C3 in
series with 20 µF left, which we can now combine into a final overall equivalent capacitance C4 .
$$\frac{1}{C_4} = \frac{1}{C_3} + \frac{1}{20\,\mu\mathrm{F}} \quad\text{or}\quad C_4 = 6.02\,\mu\mathrm{F} \tag{4.47}$$
So the equivalent capacitance of the four capacitors we started with is about 6 µF.
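The same reduction can be written as three lines of arithmetic. The Python sketch below uses illustrative helper names (`series`, `parallel`) and works in microfarads throughout; it follows the order of combination described above rather than any general network-solving method.

```python
def series(*caps):
    """Series rule (Equation 4.42): reciprocals add."""
    return 1.0 / sum(1.0 / c for c in caps)


def parallel(*caps):
    """Parallel rule (Equation 4.36): capacitances add."""
    return sum(caps)


# All values in microfarads, following Figure 4.21a step by step.
c2 = series(20.0, 3.0)    # the 20 uF and 3 uF in series
c3 = parallel(c2, 6.0)    # C2 in parallel with the 6 uF
c4 = series(c3, 20.0)     # C3 in series with the remaining 20 uF

print(f"C2 = {c2:.2f} uF, C3 = {c3:.2f} uF, C4 = {c4:.2f} uF")
```

Rounded to the precision used in the text, this reproduces 2.6 µF, 8.6 µF, and 6.02 µF.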
Finding the charge on each capacitor. Now we have to work backwards from our single
equivalent capacitor and deduce the charge and voltage on each individual capacitor, following
Figure 4.21b. First, we know the charge on C4 , the equivalent capacitor, once we know the value
of C4 (above) and ∆V (given, here 15 V):
$$Q_4 = C_4\,\Delta V = (6.02\,\mu\mathrm{F})(15\,\mathrm{V}) = 90.3\,\mu\mathrm{C}$$
Now C3 and the 20 µF are in series. Two series capacitors must both have the same charge but
different voltages. Further, the charge on series capacitors is the same as the charge on the equivalent
capacitor. Therefore, both the 20 µF and C3 have to have the same charge that C4 has. So
$$Q_3 = Q_{20\mu} = Q_4 = 90.3\,\mu\mathrm{C}$$
Now we get to the third diagram. We know that the 6 µF and C2 together have Q4 worth of charge.
Parallel capacitors both have the same voltage, but different charges. If we call the voltage on these
two capacitors V , the charge on the 6 µF is 6 µF·V , and the charge on C2 is C2 · V , which together
give us Q4 :
$$Q_4 = (6\,\mu\mathrm{F})\,V + C_2\,V = (6\,\mu\mathrm{F} + C_2)\,V$$
Note that the voltage V and the voltage on the lower 20 µF capacitor must together equal the
battery voltage, so the voltage on the lower 20 µF capacitor must be 15.00 − 10.47 = 4.53 V. Now
for the last step. You now know the charge on C2 , which is the equivalent of the upper 20 µF and
3 µF capacitors. Since they are in series, they both have the same charge, and they both
have to have Q2 . Thus $Q_{3\mu} = Q_{20\mu} = Q_2 = 27.4\,\mu\mathrm{C}$. We can find the voltage on each by noting that
$$V_{3\mu} = \frac{Q_{3\mu}}{C_{3\mu}} = 9.13\,\mathrm{V} \quad\text{and}\quad V_{20\mu} = \frac{Q_{20\mu}}{C_{20\mu}} = 1.37\,\mathrm{V}$$
Further, we know that $V_{3\mu} + V_{20\mu}$ has to equal the voltage on the equivalent capacitor C2 , viz.
10.47 V (to within rounding). So, in the end, the charge on the lower 20 µF is the same as that on the
overall equivalent capacitance, the charges on the upper 20 µF and the 3 µF are the same, and the charge
on the 6 µF is about halfway in between those two values. The charge, capacitance, and voltages are
summarized in Table 4.1.
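The backward pass can also be sketched in Python. The helper names and variable names below are illustrative; capacitances are in µF, so charges come out in µC. Because the code carries full precision rather than rounding at each step, a few values differ slightly in the last digits from the rounded numbers quoted in the text (for example, about 10.49 V rather than 10.47 V across C3).

```python
def series(*caps):
    """Series rule: reciprocals add."""
    return 1.0 / sum(1.0 / c for c in caps)


def parallel(*caps):
    """Parallel rule: capacitances add."""
    return sum(caps)


BATTERY_V = 15.0  # battery voltage used in the worked example
C_UPPER_20, C_3, C_6, C_LOWER_20 = 20.0, 3.0, 6.0, 20.0  # in microfarads

# Forward pass: reduce to a single equivalent capacitor.
c2 = series(C_UPPER_20, C_3)
c3 = parallel(c2, C_6)
c4 = series(c3, C_LOWER_20)

# Backward pass: series elements share charge, parallel elements share voltage.
q4 = c4 * BATTERY_V                 # charge on the overall equivalent (uC)
v_lower_20 = q4 / C_LOWER_20        # lower 20 uF carries the full charge Q4
v_c3 = q4 / c3                      # so does C3; this is the shared voltage V
q_6 = C_6 * v_c3                    # the 6 uF and C2 see the same voltage V
q_2 = c2 * v_c3
v_3 = q_2 / C_3                     # the 3 uF and upper 20 uF both carry Q2
v_upper_20 = q_2 / C_UPPER_20

print(f"Q4 = {q4:.1f} uC, V = {v_c3:.2f} V, V_lower20 = {v_lower_20:.2f} V")
print(f"Q6 = {q_6:.1f} uC, Q2 = {q_2:.1f} uC, V3 = {v_3:.2f} V, V_upper20 = {v_upper_20:.2f} V")
```

Two useful sanity checks fall out of this: the voltages around the loop add up to the battery voltage, and the charges on the parallel branch add up to Q4.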
Table 4.1: Equivalent capacitances, charges, and voltages for Figure 4.21.
$\Delta V_0 = Q_0/C_0$. If we now insert the dielectric, the voltage is reduced to:
$$\Delta V = \frac{\Delta V_0}{\kappa} = \frac{\Delta V_0}{\epsilon_r} \tag{4.48}$$
What happens is that part of the potential difference originally across the plates of the capacitor
is now spent on the dielectric itself. Being an insulator, the dielectric can support regions of charge,
unlike a conductor. When it is inserted into the capacitor, the part of the dielectric near the +Q0
plate builds up a partial negative charge in response, and the part near the −Q0 plate builds up a
partial positive charge. This has the effect of “canceling” part of the +Q and −Q charges on the
plates, so the battery supplies more charges to compensate! This goes on until an equilibrium is
reached, and the dielectric can steal no more charge.
In the end, since the dielectric “steals” a bit of extra charge, the capacitor with a dielectric inside
stores more charge than the capacitor without the dielectric. The total amount of charge present,
including the “extra” bit “stolen” by the dielectric, is proportional to κ, so the capacitance of the
new structure is increased by a factor of κ:
$$C = \frac{Q_0}{\Delta V} = \frac{\kappa Q_0}{\Delta V_0} = \frac{\epsilon_r Q_0}{\Delta V_0} \tag{4.49}$$
$$C = \kappa\,\epsilon_0 \frac{A}{d} = \epsilon_r\,\epsilon_0 \frac{A}{d} \tag{4.50}$$
The dielectric increases the capacitance by a factor κ, the dielectric constant. The dielectric
constant is also sometimes called $\epsilon_r$.
This is not an insignificant effect – the value of κ can range from ∼ 1 for air to a few thousand, so
adding a good dielectric layer can increase the amount of charge stored by a factor of hundreds or
thousands! For vacuum, the value is exactly 1, so Equation 4.50 just reduces to Equation 4.26. For any
material, the value of κ is greater than 1 (κ > 1), so the capacitance always increases when a dielectric
is included. Why this is true microscopically is treated in the next section. Table 4.2 lists the
dielectric constants for a few common materials.
This trick for making larger capacitors does not work indefinitely. Every dielectric has a “dielec-
tric strength,” the maximum tolerated value of the electric field inside that particular material. If
the electric field inside the dielectric exceeds this value, the dielectric breaks down, which usually
means a spark jumps across (or through) it. Exceeding the dielectric strength is a catastrophic
failure, and usually results in “magic smoke” being released from the device in question.
Material           κ           Material             κ
Vacuum             1
Air                1.00054     Teflon®              2.1
Polyethylene       2.25        Paper                3.5
Silicon dioxide    3.7         Pyrex                4.7
Rubber             7           Methanol             30
Silicon            11.68       Water (distilled)    80.1
SrTiO3             310         BaTiO3               ∼ 1000
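To get a feeling for the numbers, Equation 4.50 can be combined with the dielectric constants tabulated above. The Python sketch below is only an illustration: the plate area and gap are assumed example values, the dictionary copies a few entries from the table, and the function name is invented here. Dielectric strength (breakdown) is not modeled.

```python
EPSILON_0 = 8.854e-12  # permittivity of free space, in F/m

# A few dielectric constants copied from the table above.
DIELECTRIC_CONSTANT = {
    "vacuum": 1.0,
    "air": 1.00054,
    "Teflon": 2.1,
    "paper": 3.5,
    "Pyrex": 4.7,
    "water (distilled)": 80.1,
    "SrTiO3": 310.0,
}


def capacitance_with_dielectric(kappa, area, separation):
    """Equation 4.50: C = kappa * epsilon_0 * A / d."""
    return kappa * EPSILON_0 * area / separation


# Assumed geometry: 1 cm^2 plates separated by 0.1 mm.
area = 1.0e-4   # m^2
gap = 1.0e-4    # m

c_vacuum = capacitance_with_dielectric(DIELECTRIC_CONSTANT["vacuum"], area, gap)
for material, kappa in DIELECTRIC_CONSTANT.items():
    c = capacitance_with_dielectric(kappa, area, gap)
    print(f"{material:18s} kappa = {kappa:8.5g}  C = {c:.3e} F  ({c / c_vacuum:.5g} x vacuum)")
```

The ratio printed in the last column is just κ itself, which is the point of Equation 4.50: the geometry sets the vacuum value, and the material multiplies it.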
Somehow or another, dielectrics inside a capacitor are able to dramatically increase the amount of
charge that can be stored and decrease the voltage across the capacitor. Our explanation so far
is that the dielectric itself partly charges, which both increases the amount of charge stored and
decreases the net voltage. How does this work? In order to understand what is really going on, we
have to think a bit about the microscopic nature of the dielectric.
The dielectric itself contains a large number of atomic nuclei and electrons, but overall there are
equal amounts of positive (nuclear) and negative (electronic) charge, making the dielectric neutral
overall. We have said that charges in insulators are not mobile, so electrons and nuclei remain bound.
Despite being bound, both electrons and nuclei in a dielectric can move very slightly without breaking
their bonds. Electrons will attempt to move in the direction opposite the electric field between the
plates, and nuclei will attempt to move the other way, along the field. As a result, tiny dipoles
are formed inside the dielectric, which will be aligned along the direction of the electric field (see
Figure 4.23). Random thermal motion of the atoms or molecules will limit the degree of alignment to
an extent. In most materials the degree of alignment and the induced dipole strength are directly
proportional to the external electric field. Essentially, an electric field induces a charge separation
within the atom or molecule.
Figure 4.23: (a) Atoms and many molecules have no net charge separation without an electric field
present. (b) Some “polar” molecules have a permanent electric dipole moment. Usually, these moments are
oriented randomly from molecule to molecule, and the net moment is zero. (c) In an electric field
$\vec{E}$, non-polar molecules can have an induced dipole moment, due to electrons and nuclei wanting to
move in opposite directions in response to the field. Permanent dipoles remain bound, but can move or
rotate slightly to align with the electric field. Either way, an overall dipole moment results.
Some molecules have a natural charge separation or dipole moment already built in, so-called
polar molecules such as water. In these kinds of dielectrics, the built-in dipole moments
are usually randomly aligned, and cancel each other out overall. An electric field exerts a torque
on the dipoles, which tries to orient them along the electric field. Once again, random thermal
motion works against this alignment, but the overall effect of the electric field is a net alignment,
the degree of which is proportional to the applied electric field. Thus, in both polar and non-polar
dielectrics, there is a net orientation of dipoles when an electric field is applied. The net dipole
strength is far stronger in polar materials, and in the rest of the discussion below we will assume
that our dielectric is made of polar molecules.
Now, what happens when we place our dielectric between two conducting plates? With no
voltage applied between the plates, there is no electric field, and the tiny dipoles are randomly
oriented, Fig. 4.24b. Once a voltage is applied to the plates, a constant electric field is created
between them, which serves to align the dipoles, Fig. 4.24c. The net alignment of dipoles within a
dielectric leads to the surfaces of the dielectric being slightly charged, Fig. 4.24. Within the bulk
of the dielectric, dipoles will be aligned head-to-tail, and their electric fields will mostly cancel
(Fig. 4.24a). At the surfaces of the dielectric, however, there will be an excess of positive charge on
one side, and an excess of negative charge on the other. In this situation, the dielectric is said to be
polarized. The dielectric is still electrically neutral on the whole: an equal number of positive and
negative charges still exist, and they have only separated due to the applied electric field.
[Figure 4.24: dipoles in a dielectric between the plates, with applied field $\vec{E}$. Only a fragment of the caption
survives: “... the dielectric picks up an induced charge on ...”]
The electric field due to these effective sheets of charge is opposite that of the applied electric
field, and thus the total electric field – the sum of the applied and induced field – is smaller than
if there were no dielectric. Thus, the dielectric reduces both the applied voltage and the electric
field. The electric field due to the oriented dipoles inside the dielectric is usually proportional to
where the constant of proportionality χE is called the electric susceptibility. It represents the relative
strength of the dipoles within the material, or more accurately, how easily a material polarizes in
response to an electric field. The total electric field the dipoles experience is not just the field due
to voltage applied across the plates, but must also include the field of all the other dipoles as well:
$$\Delta V_{\text{total}} = \frac{1}{1+\chi_E}\,E_{\text{plates}}\,d = \frac{1}{1+\chi_E}\,\Delta V_0 = \frac{\Delta V_0}{\kappa} \tag{4.56}$$
Here we again use $\Delta V_0$ for the voltage on the plates without the dielectric. This result agrees
precisely with Eq. 4.48, if we make the substitution $\kappa = 1+\chi_E$, as we have in the last term in the
equation above. We can go further and calculate the capacitance, just as we did for Eq. 4.50:
$$C = (1+\chi_E)\,\epsilon_0 \frac{A}{d} = \kappa\,\epsilon_0 \frac{A}{d} = \kappa C_0 \tag{4.57}$$
where $C_0$ is the capacitance without the dielectric present. Thus, our “dielectric constant” is
simply related to the electric susceptibility, the ability of the dielectric to polarize in response to
an electric field. This makes sense in a way – the more easily polarized the dielectric, the more
easily it affects the capacitance. Also, since κ = 1 for vacuum, $\chi_E = 0$, which also makes sense as the
vacuum is not polarizable (so far as we know). The result we obtain using this more sophisticated
model is exactly the same as earlier, but now we have a plausible microscopic origin for the effect
of dielectrics in capacitors, and we know why the electric field and voltage are reduced, and the
capacitance increased.
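The relation κ = 1 + χE and Equations 4.56 and 4.57 can be summarized in a few lines of Python. This is a sketch under the same assumption as the derivation (the dielectric fills the whole gap), with an assumed susceptibility value and illustrative function names.

```python
def dielectric_constant(chi_e):
    """kappa = 1 + chi_E, the substitution used in Equation 4.56."""
    return 1.0 + chi_e


def voltage_with_dielectric(delta_v0, chi_e):
    """Equation 4.56: the voltage is reduced by a factor 1 / (1 + chi_E)."""
    return delta_v0 / (1.0 + chi_e)


def capacitance_with_dielectric(c0, chi_e):
    """Equation 4.57: C = (1 + chi_E) * C0 = kappa * C0."""
    return (1.0 + chi_e) * c0


# Assumed example: a material with chi_E = 3, a 10 pF vacuum capacitor at 120 V.
chi_e = 3.0
c0 = 10.0e-12
delta_v0 = 120.0

kappa = dielectric_constant(chi_e)
delta_v = voltage_with_dielectric(delta_v0, chi_e)
c = capacitance_with_dielectric(c0, chi_e)
print(f"kappa = {kappa}, voltage = {delta_v} V, capacitance = {c:.1e} F")

# Vacuum is not polarizable: chi_E = 0 leaves everything unchanged.
print(voltage_with_dielectric(delta_v0, 0.0), capacitance_with_dielectric(c0, 0.0))
```

Note that the product of capacitance and voltage is the same with and without the dielectric in this sketch, reflecting a fixed charge on the plates.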
Charge
Potential difference
Energy stored
None of the above
2. An ideal parallel plate capacitor is completely charged up, and then disconnected from a
battery. The plates are then pulled a small distance apart. What happens to the capacitance,
C, and charge stored, Q, respectively?
decreases; increases
increases; decreases
decreases; stays the same
stays the same; decreases
3. An isolated conductor has a surface electric potential of 10 Volts. An electron on the surface
is moved by 0.1 m. How much work must be done to move the charge? (e is the electron charge.)
1e Joules
0.1e Joules
10e Joules
0
5. A parallel plate capacitor is shrunk by a factor of two in every dimension – the separation
between the plates, as well as the plates’ length and width are all two times smaller. If the
original capacitance is C0 , what is the capacitance after all dimensions are shrunk?
$2C_0$
$\tfrac{1}{2}C_0$
$4C_0$
$\tfrac{1}{4}C_0$
7. A capacitor with air between its plates is charged to 120 V and then disconnected from the
battery. When a piece of glass is placed between the plates, the voltage across the capacitor
drops to 30 V. What is the dielectric constant of the glass? (Assume the glass completely fills
the space between the plates.)
4
2
1/4
1/2
4.9 Problems
1. Electrons in a TV tube are accelerated from rest through a potential difference of $2.00\times10^{4}$ V
from an electrode towards the screen 25.0 cm away. What is the magnitude of the electric field,
if it is assumed to be constant over the whole distance? You may assume that the electron moves
parallel to the electric field at all times.
2. A proton moves 1.5 cm parallel to a uniform electric field of E = 240 N/C. How much work
is done by the field on the proton?
3. It takes $3 \times 10^{6}$ J of energy to fully recharge a 9 V battery. How many electrons must be
moved across the 9 V potential difference to fully recharge the battery?
5. Calculate the speed of a proton that is accelerated from rest through a potential difference
of 104 V.
6. A proton at rest is accelerated parallel to a uniform electric field of magnitude 8.36 V/m
over a distance of 1.10 m. If the electric force is the only one acting on the proton, what is its
velocity in km/s after it has been accelerated over 1.10 m?
8. Two identical point charges +q are located on the y axis at y = +a and y = −a. What is the
electric potential for an arbitrary point (x,y)?
9. What is the equivalent capacitance for the five capacitors at left (approximately)?
12. A parallel plate capacitor has a capacitance C when there is vacuum between the plates. The gap
between the plates is half filled with a dielectric with dielectric constant κ in two different ways,
as shown below. Calculate the effective capacitance, in terms of C and κ, for both situations.
Hint: try breaking each situation up into two equivalent capacitors.
1. Potential difference.
2. Decreases; stays the same. The capacitance of a parallel plate capacitor is $C = \epsilon_0 A/d$. If we
pull the plates apart and increase the spacing d, the capacitance decreases. Nothing happens to
the charges already on the plates if the capacitor is disconnected, though – they have nowhere
to go!
3. 0. The charge is moved along the surface of the conductor, which is always at the same
electric potential. Since the charge has moved through no net potential difference, no work has
been done.
4. $KE_e = KE_p$. All of the potential energy lost by the proton and electron has to be
converted into kinetic energy, and both particles lose the same potential energy by moving
through the potential difference. Both particles have equal but opposite charges and move
through equal and opposite potential differences – since the negatively charged electron moves
through a positive potential difference, and the positively charged proton moves through a
negative potential difference, the net loss of potential energy q∆V is the same. Therefore, the
amount of kinetic energy gained by each particle is the same. Since both particles started at
rest, their resulting kinetic energies have to be the same. The velocity of the electron will be
much greater, however, owing to its smaller mass – recall that kinetic energy is $\tfrac{1}{2}mv^2$.
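The statement that the kinetic energies match while the speeds differ can be made concrete with a short Python sketch. The potential difference of 100 V is an assumed value chosen only for illustration, the constants are rounded textbook values, the function names are invented here, and the calculation is non-relativistic.

```python
import math

E_CHARGE = 1.602e-19      # magnitude of the electron/proton charge, in C
M_ELECTRON = 9.109e-31    # electron mass, in kg
M_PROTON = 1.673e-27      # proton mass, in kg


def kinetic_energy_gained(charge, delta_v):
    """Energy gained falling through a potential difference: |q * dV|."""
    return abs(charge * delta_v)


def speed_from_rest(charge, mass, delta_v):
    """Solve (1/2) m v^2 = |q * dV| for v (non-relativistic)."""
    return math.sqrt(2.0 * abs(charge * delta_v) / mass)


delta_v = 100.0  # assumed potential difference, in volts

ke_electron = kinetic_energy_gained(-E_CHARGE, delta_v)
ke_proton = kinetic_energy_gained(+E_CHARGE, delta_v)
v_electron = speed_from_rest(-E_CHARGE, M_ELECTRON, delta_v)
v_proton = speed_from_rest(+E_CHARGE, M_PROTON, delta_v)

print(f"KE_e = {ke_electron:.3e} J, KE_p = {ke_proton:.3e} J")
print(f"v_e = {v_electron:.3e} m/s, v_p = {v_proton:.3e} m/s")
print(f"speed ratio v_e / v_p = {v_electron / v_proton:.1f}")
```

The speed ratio depends only on the mass ratio, as the square root of the proton mass over the electron mass, and not on the potential difference chosen.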
5. $\tfrac{1}{2}C_0$. The capacitance of a parallel plate capacitor whose plates have an area A and a
separation d is $C = \epsilon_0 A/d$. If we imagine the plates to be rectangular of length l and width w,
the area A is A = lw. Let the capacitance of the capacitor be $C_0 = \epsilon_0 lw/d$ before dimensions are
shrunk. Once we reduce the length, width, and separation by two times, we have:
$$C = \frac{\epsilon_0 \left(\tfrac{1}{2} l\right)\left(\tfrac{1}{2} w\right)}{\tfrac{1}{2} d} = \frac{\epsilon_0\,\tfrac{1}{2}\, l w}{d} = \frac{1}{2} C_0$$
It is easy to prove that if we chose, e.g., circular plates, the answer would be the same – for
any reasonable shape, the area goes down as the square of the dimensional decrease, while the
separation just goes down as the factor itself.
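The scaling argument can be checked for both rectangular and circular plates with the Python sketch below. The starting dimensions are arbitrary assumed values and the function names are illustrative; only the ratio matters.

```python
import math

EPSILON_0 = 8.854e-12  # permittivity of free space, in F/m


def rectangular_plate_capacitance(length, width, separation):
    """C = epsilon_0 * (l * w) / d for rectangular plates."""
    return EPSILON_0 * length * width / separation


def circular_plate_capacitance(radius, separation):
    """C = epsilon_0 * (pi * r^2) / d for circular plates."""
    return EPSILON_0 * math.pi * radius ** 2 / separation


# Assumed starting dimensions (any positive values would do).
l, w, r, d = 0.20, 0.10, 0.05, 0.002

rect_ratio = (rectangular_plate_capacitance(l / 2, w / 2, d / 2)
              / rectangular_plate_capacitance(l, w, d))
disc_ratio = (circular_plate_capacitance(r / 2, d / 2)
              / circular_plate_capacitance(r, d))

print(f"rectangular: C / C0 = {rect_ratio:.3f}, circular: C / C0 = {disc_ratio:.3f}")
```

In both cases the area shrinks by a factor of four while the separation shrinks by two, so the capacitance is halved regardless of plate shape.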
6. This is probably another question most easily answered by elimination. In (a), the charges
are clearly of the same magnitude, since the graph is perfectly symmetric, while in (b) the
charges must be of different magnitude to explain the asymmetric graph. Therefore, the third
answer cannot be correct.
In (a), the potential is constant along a vertical line separating the two charges (since there is
a perfectly vertical line running halfway between the charges). This would only be true if they
are of opposite signs. If the charges were of the same sign, there would be equipotential lines
running horizontally from charge to charge. Similarly, the charges must also be of opposite sign
in (b). This also rules out the first answer.
Based on the similarity of (a) and (b), it must be that if (a) has charges of opposite sign,
then so does (b). This also means that the fourth answer is out, which leaves only the second
answer as a possibility. If you are still not clear on why the correct answer must be the second
one, you may want to look carefully at the examples of equipotential lines in different situations
presented in this chapter.
7. 4. Without the piece of glass, our capacitor has a value we’ll call C. The charge stored on
the capacitor is Q = CV = 120C when the initial voltage is Vinitial = 120 V. The piece of glass
acts as a dielectric, which increases the capacitance to κC (κ is always greater than 1).
Since the battery was disconnected, after inserting the piece of glass the total amount of charge
Q stays the same - there is no source for additional charge to enter the capacitor. Now, however,
the voltage Vfinal is less and the capacitance is more. We can set the initial amount of charge
before inserting the glass equal to the final charge after inserting the glass, and solve for κ:
$$Q = C V_{initial} = \kappa C V_{final}$$
$$\implies V_{initial} = \kappa V_{final}$$
$$\kappa = \frac{V_{initial}}{V_{final}} = \frac{120}{30} = 4$$
1. $8.00 \times 10^4$ V/m. In a constant electric field, the electric field, potential difference and
displacement are related by:
$$\Delta V = -|\vec{E}|\,|\Delta\vec{x}|\cos\theta \tag{4.58}$$
Since the displacement and electric field are parallel everywhere, θ = 0, and we have just ∆V =
−E∆x. We have a potential difference $\Delta V = 2\times10^4$ V developed over a displacement of ∆x = 25 cm
(0.25 m). Plugging in the numbers:
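$$\Delta V = -E\,\Delta x \tag{4.59}$$
$$2\times10^4\ \mathrm{V} = -E\,(0.25\ \mathrm{m}) \tag{4.60}$$
$$\implies E = -\frac{2\times10^4\ \mathrm{V}}{0.25\ \mathrm{m}} = -8.00\times10^4\ \mathrm{V/m} \tag{4.61}$$
Since we want only the magnitude of the electric field, it is sufficient to write $8.00 \times 10^4$ V/m.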
2. $5.8 \times 10^{-19}$ J. The work done in moving a single charge through a constant electric field is
given by:
$$W = qE_x\,\Delta x \tag{4.62}$$
where Ex is the component of the electric field parallel to the displacement. In this case,
the displacement is always parallel to the electric field, so Ex is just the total field and ∆x
the displacement. Now we just plug in the numbers, remembering to put the displacement in
meters:
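$$W = qE\,\Delta x \tag{4.63}$$
$$= \left(1.6\times10^{-19}\ \mathrm{C}\right)(240\ \mathrm{N/C})(0.015\ \mathrm{m}) \tag{4.64}$$
$$\approx 5.8\times10^{-19}\ \mathrm{N\cdot m} = 5.8\times10^{-19}\ \mathrm{J} \tag{4.65}$$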
In the last line we used the fact that one Joule is defined to be one Newton times one meter.
3. $2\times10^{24}$ electrons. The energy required to charge the battery is just the amount that the
potential energy of all the charges changes by. Each electron is moved through 9 V, which means
each electron changes its potential energy by −e · 9 V, where e is the charge on one electron.
The total potential energy is the potential energy per electron times the number of electrons,
n. Basically, this is conservation of energy: the total energy into the battery has to equal the
amount of energy to move one electron across 9 V times the number of electrons.
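$$\Delta E_{in} + \Delta PE = 0$$
$$3.6\times10^6\ \mathrm{J} + n(-e\cdot 9\ \mathrm{V}) = 0$$
$$n\,e\cdot 9\ \mathrm{V} = 3.6\times10^6\ \mathrm{J}$$
$$n = \frac{3.6\times10^6\ \mathrm{J}}{e\cdot 9\ \mathrm{V}} = \frac{3.6\times10^6\ \mathrm{J}}{\left(1.6\times10^{-19}\ \mathrm{C}\right)(9\ \mathrm{V})} = \frac{3.6\times10^6}{\left(1.6\times10^{-19}\right)(9)} \approx 2\times10^{24}$$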
Here we make use of the fact that Coulombs times Volts is Joules. As usual, if you just use
proper SI units throughout, the units will work out on their own.
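As a quick numerical check of the estimate above, here is a minimal Python sketch. The function name `electrons_for_energy` is made up for illustration, and the elementary charge is rounded to 1.6 × 10^−19 C as in the worked solution.

```python
# Number of electrons that must cross a potential difference V
# to account for a total energy E:  n = E / (e * V).
E_CHARGE = 1.6e-19  # C, elementary charge (rounded, as in the text)


def electrons_for_energy(energy_joules, voltage):
    """Return the number of electrons moved through `voltage` for `energy_joules`."""
    return energy_joules / (E_CHARGE * voltage)


n_electrons = electrons_for_energy(3.6e6, 9.0)
print(f"n = {n_electrons:.2e} electrons")
```

With these rounded inputs the ratio comes out near 2.5 × 10^24, which is the same order of magnitude as the quoted answer.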
4. 6.02 µF. See page 128, this is the same capacitor layout!
5. $1.41 \times 10^5$ m/s. When the proton is accelerated through a potential difference ∆V, it loses
a potential energy of e∆V , which is converted into kinetic energy. We only need to apply
conservation of energy, noting that the proton started at rest, and choosing our zero of potential
energy such that the final potential energy is zero:
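$$E_{initial} = E_{final}$$
$$KE_{initial} + PE_{initial} = KE_{final} + PE_{final}$$
$$0 + q\,\Delta V = \tfrac{1}{2} m_p v_f^2 + 0$$
$$\implies v_f^2 = \frac{2q\,\Delta V}{m_p}$$
$$v_f = \sqrt{\frac{2q\,\Delta V}{m_p}} = \sqrt{\frac{2\cdot 1.6\times10^{-19}\ \mathrm{C}\cdot 104\ \mathrm{V}}{1.67\times10^{-27}\ \mathrm{kg}}}$$
$$\approx 1.41\times10^{5}\,\left[\mathrm{C\cdot V/kg}\right]^{1/2} = 1.41\times10^{5}\,\left[\mathrm{J/kg}\right]^{1/2} = 1.41\times10^{5}\,\left[\frac{\mathrm{kg\cdot m^2}}{\mathrm{s^2\cdot kg}}\right]^{1/2}$$
$$= 1.41\times10^{5}\ \mathrm{m/s}$$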
The units are a bit tricky here, but remember that if you keep everything in proper SI units from
the start, they will always work out ok. Remember from the definition of electrical potential
that one Volt is equal to one Joule per Coulomb, 1 V = 1 J/C – it then follows that 1 C · V = 1 J.
6. 42.0 km/s. Of course, 42 is the answer to life, the universe, and everything.x
Anyway. The proton starts from rest, and hence has no kinetic energy. It is accelerated by
an electric field, and thus gains kinetic energy. The kinetic energy gained must come from the
electric field. A charge q moving parallel to a constant electric field E over a distance ∆x changes
its potential energy by:
∆P E = qE∆x
The charge on a proton is just +e, and E and ∆x are given. The change in kinetic energy is
just the final kinetic energy of the proton, since it started from rest. The gain in kinetic energy
must equal the change in potential energy:
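$$\tfrac{1}{2} m_p v_{final}^2 = qE\,\Delta x$$
$$v_{final} = \sqrt{\frac{2\left(1.6\times10^{-19}\ \mathrm{C}\right)(8.36\ \mathrm{V/m})(1.10\ \mathrm{m})}{1.67\times10^{-27}\ \mathrm{kg}}}$$
$$\approx 42000\sqrt{\mathrm{C\cdot V/kg}} = 42000\sqrt{\mathrm{J/kg}} = 42000\sqrt{\frac{\mathrm{kg\cdot m^2}}{\mathrm{s^2\cdot kg}}} = 42\ \mathrm{km/s}$$
[x] From Hitchhiker’s Guide to the Galaxy ... there are often nerd jokes on physics exams.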
Making absolutely sure that the units work out, one should note that Coulombs times Volts is
Joules, or kg·m2 /s2 . If you always use proper SI units, it will work out though, and you won’t
have to remember lots of unit conversions.
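Both proton problems above reduce to the same energy balance, so they can be checked with one small helper. This is only a sketch: the function name `speed_from_rest` is hypothetical, and the proton charge and mass are the rounded values used in the solutions.

```python
import math

E_CHARGE = 1.6e-19   # C, proton charge (rounded, as in the text)
M_PROTON = 1.67e-27  # kg, proton mass (rounded, as in the text)


def speed_from_rest(charge, mass, delta_v):
    """Speed gained from rest after falling through a potential difference.

    Uses (1/2) m v^2 = q * delta_v, with delta_v taken as a positive magnitude.
    """
    return math.sqrt(2.0 * charge * delta_v / mass)


# Problem 5: proton accelerated through 104 V.
v_problem5 = speed_from_rest(E_CHARGE, M_PROTON, 104.0)

# Problem 6: uniform field E over a distance dx, so delta_v = E * dx.
v_problem6 = speed_from_rest(E_CHARGE, M_PROTON, 8.36 * 1.10)

print(f"problem 5: {v_problem5:.3e} m/s")
print(f"problem 6: {v_problem6 / 1000:.1f} km/s")
```

Because everything is kept in SI units, no explicit unit conversion is needed until the final print in km/s.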
$$PE_{(1,2)} = k_e \frac{q_1 q_2}{r_{12}}$$
Here r12 is the separation between charges 1 and 2, or just 1.0 m in this case. We do the same
for the other two pairs of charges, and add all three energies together (being very careful with
signs):
8. $\frac{k_e q}{\sqrt{x^2+(a-y)^2}} + \frac{k_e q}{\sqrt{x^2+(a+y)^2}}$. For this one, it is perhaps easier to draw ourselves a picture:
We will label the upper charge 1, and the lower charge 2. The principle of superposition tells
us that we only need to find the potential at point (x, y) due to each separately, and then add
the results together. First, we focus on charge 1, located at (0, a). We need the distance
d1 from charge 1 to the point (x, y). The horizontal distance is just x, and the vertical distance
is a − y, so:
$$d_1 = \sqrt{x^2 + (a-y)^2} \tag{4.66}$$
[Figure: two charges +q on the y axis at (0, a) and (0, −a), at distances d1 and d2 from the point (x, y).]
The potential due to the first charge, which we’ll call V1, is then found from Eq. 4.14:
$$V_1 = \frac{k_e q}{d_1} = \frac{k_e q}{\sqrt{x^2 + (a-y)^2}} \tag{4.67}$$
The potential due to the second charge at (0, −a) is found in an identical manner, only noting
that the vertical distance is now a+y:
$$d_2 = \sqrt{x^2 + (a+y)^2} \tag{4.68}$$
$$V_2 = \frac{k_e q}{d_2} = \frac{k_e q}{\sqrt{x^2 + (a+y)^2}} \tag{4.69}$$
Finally, since potential is a scalar quantity (it has only magnitude, not direction), the super-
position principle tells us that the total electric potential at point (x, y) is just the sum of the
individual potentials due to charges 1 and 2:
$$V_{tot} = V_1 + V_2 = \frac{k_e q}{\sqrt{x^2 + (a-y)^2}} + \frac{k_e q}{\sqrt{x^2 + (a+y)^2}} \tag{4.70}$$
Without resorting to approximations, there isn’t really a much more aesthetically pleasing form
for this one.
9. First of all, we should notice that the 7 µF capacitor has nothing connected to its right wire,
so it can’t possibly be doing anything in this circuit. We can safely ignore it. Next, the 3 µF
and 14 µF capacitors are simply in series, so we can readily find their equivalent capacitor:
$$C_{eff,3\&14} = \frac{(3\ \mu\mathrm{F})(14\ \mu\mathrm{F})}{(3\ \mu\mathrm{F}) + (14\ \mu\mathrm{F})} \approx 2.47\ \mu\mathrm{F}$$
This 2.47 µF effective capacitor is purely in parallel with the 6 µF capacitor. We can therefore
just add the two capacitances together and come up with an equivalent capacitance for the 3,
14, and 6 µF capacitors:
$$C_{eff,3,14,\&6} = 2.47\ \mu\mathrm{F} + 6\ \mu\mathrm{F} = 8.47\ \mu\mathrm{F}$$
Finally, that equivalent capacitance is just in series with the 20 µF capacitor, so the overall
equivalent capacitance is readily found:
$$C_{eff,total} = \frac{C_{eff,3,14,\&6}\,(20\ \mu\mathrm{F})}{C_{eff,3,14,\&6} + 20\ \mu\mathrm{F}} \approx 6\ \mu\mathrm{F}$$
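The series and parallel reductions used in this solution are easy to mechanize. The sketch below uses two hypothetical helpers, `series` and `parallel`, and works directly in microfarads; the intermediate values in the worked solution can be checked against what it prints.

```python
def series(*caps):
    """Equivalent capacitance of capacitors in series: 1/C = sum(1/C_i)."""
    return 1.0 / sum(1.0 / c for c in caps)


def parallel(*caps):
    """Equivalent capacitance of capacitors in parallel: C = sum(C_i)."""
    return sum(caps)


# All values in microfarads; the dangling 7 uF capacitor is ignored.
c_3_14 = series(3.0, 14.0)        # 3 uF and 14 uF in series
c_3_14_6 = parallel(c_3_14, 6.0)  # that combination in parallel with 6 uF
c_total = series(c_3_14_6, 20.0)  # finally in series with 20 uF

print(f"3 & 14 in series:   {c_3_14:.2f} uF")
print(f"... parallel with 6: {c_3_14_6:.2f} uF")
print(f"total:               {c_total:.2f} uF")
```

Writing the reduction as nested calls mirrors the order in which the circuit is simplified, from the innermost pair outward.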
10. Once again, we can simply use the principle of superposition. The total electric potential
at any point is just the sum of the electric potentials due to each point charge. We’ll label the
charges 1-3 from left to right, and calculate the potential due to each first.
If we take an arbitrary point on the y axis (0, y), what is the distance to charge 1? The vertical
distance will always be just y, and the horizontal distance is just d. Therefore, the distance d1
to the first charge is:
$$d_1 = \sqrt{d^2 + y^2} \tag{4.71}$$
The electric potential V1 due to charge 1, +Q, is then found from Eq. 4.14:
$$V_1 = \frac{k_e Q}{d_1} = \frac{k_e Q}{\sqrt{d^2 + y^2}} \tag{4.72}$$
The distance to charge 2 is simply y, since it is also located on the y axis. The electric potential
V2 due to charge 2 is then:
$$V_2 = \frac{-2k_e Q}{y} \tag{4.73}$$
Finally, the distance to charge 3 is just the same as the distance to charge 1. Since both charges
also have the same magnitude, V1 = V3 . The total potential at a point (0, y) is then just the
sum of the potentials from all three individual charges:
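$$V_{tot} = V_1 + V_2 + V_3 \tag{4.74}$$
$$= \frac{k_e Q}{\sqrt{d^2 + y^2}} + \frac{-2k_e Q}{y} + \frac{k_e Q}{\sqrt{d^2 + y^2}} \tag{4.75}$$
$$= \frac{2k_e Q}{\sqrt{d^2 + y^2}} + \frac{-2k_e Q}{y} \tag{4.76}$$
$$= 2k_e Q\left[\frac{1}{\sqrt{d^2 + y^2}} - \frac{1}{y}\right] \tag{4.77}$$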
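The superposition argument can be verified numerically by summing the three point-charge potentials directly and comparing with the closed form of Eq. 4.77. This is a sketch under assumptions: the charges are placed at (−d, 0), (0, 0), and (d, 0), which is the geometry implied by the distances used above, and the values of Q, d, and y are arbitrary illustrative numbers.

```python
import math

K_E = 8.99e9  # N*m^2/C^2, Coulomb constant


def potential_at(point, charges):
    """Sum k_e * q / r over point charges given as (q, (x, y)) tuples."""
    px, py = point
    return sum(K_E * q / math.hypot(px - cx, py - cy) for q, (cx, cy) in charges)


def closed_form(Q, d, y):
    """Eq. 4.77: V = 2 k_e Q (1/sqrt(d^2 + y^2) - 1/y), written for y > 0."""
    return 2.0 * K_E * Q * (1.0 / math.hypot(d, y) - 1.0 / y)


Q, d, y = 1e-9, 1.0, 2.0  # illustrative values, not taken from the problem
charges = [(Q, (-d, 0.0)), (-2.0 * Q, (0.0, 0.0)), (Q, (d, 0.0))]

v_sum = potential_at((0.0, y), charges)
v_formula = closed_form(Q, d, y)
print(v_sum, v_formula)
```

Since potential is a scalar, the direct sum needs only distances, never components, which is why the helper is so short.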
11. −9 nJ. The potential energy of a system of charges can be found by calculating the potential
energy for every unique pair of charges and adding the results together. In this case, we have
three unique pairings: charges 1 and 2, charges 2 and 3, and charges 1 and 3:
Here r12 is the distance between charge 1 and 2, and so on. Since we have an equilateral triangle,
all distances are 1 m. Since all charges are equal in magnitude, we can simplify this quite a bit
once we plug in what we know - we just need to keep track of the signs of the charges:
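$$PE_{total} = k_e\left[\frac{q_1 q_2}{r_{12}} + \frac{q_1 q_3}{r_{13}} + \frac{q_2 q_3}{r_{23}}\right]$$
$$= \left(9\times10^{9}\ \frac{\mathrm{N\cdot m^2}}{\mathrm{C^2}}\right)\left[\frac{\left(10^{-9}\ \mathrm{C}\right)\left(-10^{-9}\ \mathrm{C}\right)}{1\ \mathrm{m}} + \frac{\left(10^{-9}\ \mathrm{C}\right)\left(10^{-9}\ \mathrm{C}\right)}{1\ \mathrm{m}} + \frac{\left(-10^{-9}\ \mathrm{C}\right)\left(10^{-9}\ \mathrm{C}\right)}{1\ \mathrm{m}}\right]$$
$$= \left(9\times10^{9}\ \frac{\mathrm{N\cdot m^2}}{\mathrm{C^2}}\right)\left(\frac{10^{-18}\ \mathrm{C^2}}{\mathrm{m}}\right)\left[-1 + 1 - 1\right]$$
$$= 9\times10^{-9}\ \mathrm{N\cdot m}\,[-1]$$
$$\approx -9\times10^{-9}\ \mathrm{J}$$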
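The "every unique pair" rule maps directly onto a loop over combinations. The sketch below is illustrative only: `total_potential_energy` is a made-up name, the triangle coordinates are one convenient placement of an equilateral triangle with 1 m sides, and the Coulomb constant is rounded to 9 × 10^9 as in the solution.

```python
import math
from itertools import combinations

K_E = 9e9  # N*m^2/C^2, rounded as in the worked solution


def total_potential_energy(charges):
    """Sum k_e * q_i * q_j / r_ij over every unique pair of point charges.

    `charges` is a list of (q, (x, y)) tuples.
    """
    total = 0.0
    for (qi, pos_i), (qj, pos_j) in combinations(charges, 2):
        total += K_E * qi * qj / math.dist(pos_i, pos_j)
    return total


side = 1.0  # m
triangle = [
    (1e-9, (0.0, 0.0)),                                 # charge 1
    (-1e-9, (side, 0.0)),                               # charge 2
    (1e-9, (side / 2.0, side * math.sqrt(3.0) / 2.0)),  # charge 3
]

energy = total_potential_energy(triangle)
print(f"PE_total = {energy:.2e} J")
```

Using `combinations(charges, 2)` guarantees each pair is counted once, which is the step most easily gotten wrong by hand.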
12. (a) Dielectric parallel to the plates: $C_{eff} = \frac{2K}{1+K} C_0$.
It is easiest to think of this as two capacitors in series, both with half the plate spacing - one
filled with dielectric, one with nothing. First, without any dielectric, we will say that the original
capacitor has plate spacing d and plate area A. The capacitance is then:
$$C_0 = \frac{\epsilon_0 A}{d} \tag{4.80}$$
The upper half capacitor with dielectric then has a capacitance:
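$$C_d = \frac{K\epsilon_0 A}{d/2} = \frac{2K\epsilon_0 A}{d} = 2KC_0 \tag{4.81}$$
$$C_{none} = \frac{\epsilon_0 A}{d/2} = \frac{2\epsilon_0 A}{d} = 2C_0 \tag{4.82}$$
$$\frac{1}{C_{eff}} = \frac{1}{2KC_0} + \frac{1}{2C_0} \tag{4.83}$$
$$C_{eff} = \frac{4KC_0^2}{2KC_0 + 2C_0} \tag{4.84}$$
$$= \frac{2K}{1+K}\,C_0 \tag{4.85}$$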
(b) Dielectric “perpendicular” to the plates: $C_{eff} = \frac{K+1}{2} C_0$.
In this case, we think of the half-filled capacitor as two capacitors in parallel, one filled with
dielectric, one with nothing. Now each half capacitor has half the plate area, but the same
spacing. The upper half capacitor with dielectric then has a capacitance:
$$C_d = \frac{K\epsilon_0 \tfrac{1}{2}A}{d} = \frac{K\epsilon_0 A}{2d} = \frac{1}{2}KC_0 \tag{4.86}$$
The half capacitor without then has
$$C_{none} = \frac{\epsilon_0 \tfrac{1}{2}A}{d} = \frac{\epsilon_0 A}{2d} = \frac{1}{2}C_0 \tag{4.87}$$
Now we just add our parallel capacitors:
$$C_{eff} = \frac{1}{2}KC_0 + \frac{1}{2}C_0 \tag{4.88}$$
$$= \frac{1}{2}\left(K + 1\right)C_0 \tag{4.89}$$
$$= \frac{K+1}{2}\,C_0 \tag{4.90}$$
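Both half-filled geometries can be compared side by side with a few lines of code. This is a sketch with hypothetical function names; K is the dielectric constant and `c0` the capacitance of the empty capacitor, as in the derivation above.

```python
def half_filled_series(K, c0):
    """Dielectric slab parallel to the plates (two half-gap capacitors in series)."""
    c_dielectric = 2.0 * K * c0  # Eq. 4.81
    c_empty = 2.0 * c0           # Eq. 4.82
    return c_dielectric * c_empty / (c_dielectric + c_empty)


def half_filled_parallel(K, c0):
    """Dielectric slab perpendicular to the plates (two half-area capacitors in parallel)."""
    return 0.5 * K * c0 + 0.5 * c0  # Eq. 4.88


for K in (1.0, 3.0, 10.0):
    print(K, half_filled_series(K, 1.0), half_filled_parallel(K, 1.0))
```

A useful sanity check is K = 1, where the "dielectric" is just vacuum and both expressions must return the original capacitance.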
Electric current is something that we use and hear about every day, but few of us stop to
think about what it really is. What is an electric current? An electric current is nothing more
than the net flow of charges through some region in a conductor.
If we take a cross section of a conductor, such as a circular wire, an
electric current is said to exist if there is a net flow of charge through this
surface. The amount of current is simply the rate at which charge is flow-
ing, the number of charges per unit time that traverse the cross-section.
Strictly speaking, we try to choose the cross-sections for defining charge
flow such that the charges flow perpendicular to that surface, somewhat
like we did for Gauss’s law. Figure 5.2 shows a cartoon depiction of how
we define current.
Current is a flux of charge through a wire in the same way that water
flow is a flux of water through a pipe. As we shall see, this is a reasonable
way to think about electric circuits as well – current always has to flow
somewhere, and you don’t want an open connection any more than you
would want an open-ended water pipe. Voltage is more like a pressure
gauge – you can have a voltage even when nothing is flowing, it just
means there is the potential for flow (nerdy pun intended).
Figure 5.1: Georg Simon Ohm (1789 – 1854), a German physicist who first found the relationship between current, voltage, and resistance. [20]
If a net amount of charge ∆Q flows perpendicularly through a particular surface of area A within
a time interval ∆t, we define the electric current to be simply the amount of charge divided by the
time interval:
$$I \equiv \frac{\Delta Q}{\Delta t} \tag{5.1}$$
This represents a conservation law as well. Charge can neither be created nor destroyed. If we
have some steady stream of charge pouring into a region of fixed volume, then the charge density
inside would continually grow (tending toward infinity!) if there were not also some compensating
flow of charges out of the volume. Putting it the other way around, if a steady stream of charges
were leaving the fixed volume, the charge density would also become infinitely large if there were
not some other source of charges to replace those lost. But creating charges out of thin air is the
one thing that definitely will not happen! Therefore, the change in the total number of charges in
a volume at any time has to equal the net flow of current through that volume, otherwise we would
require spontaneous generation of charge.i
Direction of Current Flow: the direction of current flow is defined as the direction
of net positive charge flow. The flow of electrons, which are usually responsible for the
current, is opposite the direction of current flow due to their negative charge.
on the conductor’s surface. If we then connect the conductor to ground, the excess charge will flow
down through the wire – this is a current! So this is one way to make electric currents - put some
excess charge on an isolated conductor, and take it away by connecting it to ground. Of course,
this is a somewhat cumbersome method . . .
More generally, what are we doing when we put excess charge on the conductor? We are
changing its electric potential relative to the ground. Let’s say we take electrons away from our
conductor, which makes its potential positive relative to ground. Once we connect the conductor
to a ground wire, electrons in the ground wire are attracted to our conductor and its relatively
positive potential, and they flow up from the ground into the conductor until the conductor is
at the same electric potential as the ground. Any time we can make one conductor, or part of a
conductor, at a different potential than another, current will try to flow between them. Try being
the operative word.
Whether any current actually flows is another story. It depends on how we connect conductors which are
at different potential. If we make the potential difference big enough, though, the electrons will
always find a way to flow and make a current. Think about how this explains “static” shocks, or
the sparks from a Van de Graaff generator, for example.
So this is our answer – in order to get a net flow of charges, we need to provide a potential
difference (voltage!). [iv] The presence of a voltage gives rise to an electric field across the conductor,
which in turn causes an electric force, which accelerates the charges. The effectiveness of a potential
difference to cause a current depends on the density of charge carriers, their average speed, and
microscopic properties of the conductor itself.
The free charges in conductors are extremely numerous and fairly mobile, as we already
know. Inside a normal conductor, like copper, there is a fantastic density of charge carriers,
$\sim 10^{22}$ electrons per cm$^3$! [21] So many, in fact, that they continuously scatter off of each
other and the fixed atoms in the conductor (about once every $10^{-14}$ sec or so, even in a
good conductor!). Typical drift speeds in copper are $\sim 10^{-3}$ to $10^{-4}$ m/s for moderate electric
fields, compared to the speed of random thermal electron motion of $\sim 10^{5}$ m/s. [22] Any particular
charge carrier has a hard time getting anywhere. Even though the charges are mobile,
and able to move at fantastic speeds, the time it takes to actually get anywhere is quite a bit
Figure 5.3: (a) There are three positive charges moving to the right (+x direction), which count as +3, and one negative charge
moving to the left (−x direction). A negative charge moving backwards is the same as a positive charge moving forwards,
which counts as −(−1) = +1, so the total relative current is 3 + 1 = 4. (b) Three positive charges moving to the left count
as −3, and one negative charge moving left counts as +1, for a total relative current of −3 + 1 = −2. (c) Two negative charges
moving to the left count as −(−2) = 2, so the total relative current is 2.
[iv] From now on, we will interchangeably use the phrases “potential difference” and “voltage.” From our point of
view, they are the same thing.
[Figure labels: ∆x, v_d ∆t]
[v] Distinct from and not to be confused with the random thermal motion, see below.
The number of charges which cross the surface A, those close enough to reach it in a time ∆t,
is just the number contained within the volume A · ∆x, or A v_d ∆t. A bit more mathematically, we
can write this:
$$N = nA\,\Delta x = nA v_d\,\Delta t$$
Here we have used n to represent the number of charges per unit volume, the carrier density. The
total amount of charge is the number of charge carriers times how much charge each one carries,
which we’ll call q. The current then is just the total amount of charge, N q divided by the total
amount of time, ∆t:
Current flow related to drift velocity:
$$I = \frac{\Delta Q}{\Delta t} = \frac{Nq}{\Delta t} = \frac{nqAv_d\,\Delta t}{\Delta t} = nqAv_d \tag{5.5}$$
here A is the cross-sectional area of the conductor, n the number of charge carriers per
unit volume (their density), q is the charge on each, and vd is their drift velocity.
We can see that the drift velocity and resulting current are larger when the carriers carry more
charge q, or when their mass is small. However, it would be nice to have expressions that didn’t
directly involve the cross-sectional area of the conductor, so we can calculate general properties
independent of any particular conductor shape or size. For this reason, it is common to introduce
current density, J, which is just the current per unit area. Rewriting Eq. 5.5 in terms of current
density, we come up with a simpler and more general expression:
Current density related to drift velocity:
$$J \equiv \frac{I}{A} = nqv_d \tag{5.6}$$
Now we can calculate the current density for any given material of arbitrary geometry, and later
specify a cross-sectional area to determine absolute currents.
The units of current density: J is current per unit area, and has units of amperes
per square meter [A/m²].
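To get a feel for Eqs. 5.5 and 5.6, the sketch below estimates the drift speed in a wire. The numbers are assumptions chosen for illustration, not values from the text: a 1 A current in a wire of 1 mm diameter, with the carrier density of 10^22 per cm^3 quoted earlier converted to 10^28 per m^3. The function names are hypothetical.

```python
import math

E_CHARGE = 1.6e-19  # C, magnitude of the charge per carrier


def drift_speed(current, carrier_density, area, charge=E_CHARGE):
    """Solve I = n q A v_d (Eq. 5.5) for the drift speed v_d."""
    return current / (carrier_density * charge * area)


def current_density(current, area):
    """J = I / A (Eq. 5.6)."""
    return current / area


radius = 0.5e-3              # m, assumed wire radius (1 mm diameter)
area = math.pi * radius**2   # m^2, cross-sectional area
n_carriers = 1e28            # 1/m^3, i.e. 10^22 per cm^3

v_d = drift_speed(1.0, n_carriers, area)
J = current_density(1.0, area)

print(f"v_d = {v_d:.2e} m/s")
print(f"J   = {J:.2e} A/m^2")
```

Under these assumptions the drift speed lands below a millimeter per second, inside the range quoted for copper.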
∆V is to increase the drift velocity. This is basically true, but justifying that statement will require
a few more steps.
More accurately, the presence of a potential difference between two points on the conductor
means that those two points are at different potential energies. Recall that negative charges want
to move from regions of lower potential to regions of higher potential. In a conductor, even when a
current flows, the charges like to spread out as evenly as possible. This even and moving distribution
of charge gives rise to a uniform electric field. If the potential difference ∆V is applied over some
distance l, and the electric field is uniform, we know from Equations 4.12 and 4.13 that the electric
field along the length of the conductor must be given by:
$$E = \frac{\Delta V}{l} \tag{5.7}$$
The presence of the electric field causes an acceleration of the charge carriers:
$$a = \frac{F_e}{m} = \frac{q}{m}E \tag{5.8}$$
Thus the acceleration of the charge carriers depends only on the electric field and their charge-mass
ratio, q/m, about $1.76\times10^{11}$ C/kg for electrons. In order to figure out how much current will flow
for a given potential difference, we need to find a way to take into account the dissipative effect of
all the collisions the carriers are constantly undergoing. In a sense, the collection of charge carriers
is a bit like an ideal gas, and our treatment here is reminiscent of an ideal gas law derivation. The
analogy is a close one (and useful if you are a chemist) – the innumerable electrons in a conductor
are often called an electron gas.
If we assume the charge carriers are electrons, of mass me (and charge −e), then each has an
average momentum p = me vd .vi We expect on average that each collision an electron experiences
will completely destroy all forward momentum – they are stopped cold by every single collision. This
makes some sense, since most of the collisions will be with the atoms making up the conductor,
which are very heavy compared to electrons, rather than with other electrons. If all forward
momentum is destroyed, then the electron is left with only its random thermal motion. If there
were no electric force present to accelerate the electrons, the random thermal motion of all the
electrons will cancel out, and there is no net flow or current.
[vi] Since we are talking about zillions of collisions in every possible random direction, there is no need to carry
around the vector baggage here. We will just deal with scalar magnitudes.
We can easily find the thermal velocity of the carriers just like we do for an ideal gas – the
thermal energy of the electrons is $\tfrac{3}{2}k_B T$, where $k_B$ is Boltzmann’s constant, and we equate this to
the kinetic energy:
$$\frac{3}{2}k_B T = \frac{1}{2}mv_{th}^2 \tag{5.9}$$
$$\implies v_{th} = \sqrt{\frac{3k_B T}{m}} \sim 10^{5}\ \mathrm{m/s} \quad (\text{at } 295\ \mathrm{K}) \tag{5.10}$$
Here we use vth to specify the thermal velocity distinctly from the electric-field-induced drift velocity.
As it turns out, the thermal velocity typically greatly exceeds the drift velocity (by ten million times
or so!) – the acceleration of the carriers by the electric field induces only a tiny velocity compared
to that given by the random thermal motion of the carriers. Again, this is what leads to carriers
covering huge distances but having very small displacements. The overall motion is terribly chaotic,
and even fairly large electric fields only alter the carrier velocity in conductors by parts per million
at best. Still, the random thermal velocities do not contribute to the electric current,vii it is only
the tiny field-induced drift velocity that gives rise to electric current.
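The order-of-magnitude estimate in Eq. 5.10 is easy to reproduce. The sketch below assumes the standard values of Boltzmann's constant and the electron mass; `thermal_speed` is a name introduced here for illustration.

```python
import math

K_B = 1.38e-23         # J/K, Boltzmann's constant
M_ELECTRON = 9.11e-31  # kg, electron mass


def thermal_speed(temperature, mass=M_ELECTRON):
    """Thermal speed from (3/2) k_B T = (1/2) m v_th^2 (Eqs. 5.9 and 5.10)."""
    return math.sqrt(3.0 * K_B * temperature / mass)


v_th = thermal_speed(295.0)
print(f"v_th = {v_th:.2e} m/s at 295 K")
```

Because the speed scales as the square root of temperature, even large temperature changes shift this estimate by only a modest factor.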
Boltzmann’s Constant: $k_B = 1.38\times10^{-23}$ J/K
We should also keep in mind that the collisions the carriers undergo are not continuous, but
happen one after another with some average time between them τ .viii In that time interval, the
electron loses its momentum me vd due to a collision, and thereafter regains it due to the action
of electric field present, only to lose it again about τ seconds later. As stated above, the presence
of the electric force $F_e$ gives the electron an acceleration $a = F_e/m_e$, which allows it to regain its
former drift velocity. From kinematics, we would expect a mean drift velocity $v_d \approx a\tau$.ix
The starting and stopping motion of the carriers gives us an average rate at which the electrons
are losing momentum due to the collisions and associated impulse forces. We can straightforwardly
find this momentum change as:
$$\left.\frac{\Delta p}{\Delta t}\right|_{loss} = \frac{m_e v_d}{\tau} \tag{5.12}$$
Remember, ∆p/∆t is also a force – we are still dealing with kinematics, even though we have
involved electricity. Once the scattering event is over, the electron regains momentum through the
action of the electric force caused by the electric field. We can easily write down the momentum
gain per unit time:
$$\left.\frac{\Delta p}{\Delta t}\right|_{gain} = F_e = qE = -eE \tag{5.13}$$
[vii] They do give rise to electrical noise, however.
[viii] For Cu, we can estimate [21] $\tau \sim 2\times10^{-14}$ s.
[ix] Depending on the method of derivation, there may be a factor of 2 in this expression, but the physics is the same.
Now, the total momentum loss has to equal the total momentum gain for there to be a steady
state. If this were not true, the momentum would quickly build up, and the whole wire would start
to move! So we must impose conservation of momentum:
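$$\left.\frac{\Delta p}{\Delta t}\right|_{loss} = \left.\frac{\Delta p}{\Delta t}\right|_{gain} \tag{5.14}$$
$$\frac{m_e v_d}{\tau} = -eE \tag{5.15}$$
$$v_d = \frac{-e\tau}{m_e}E \tag{5.16}$$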
Now we have an expression for the average drift velocity of electrons flowing along the wire, in
terms of the average time between carrier collisions:
$$v_d = \frac{-e\tau}{m_e}E \tag{5.17}$$
here τ is the mean time between collisions, and E the electric field.
The minus sign makes sense here, by the way. Since electrons are negatively charged, they
move in the opposite direction that the electric field lines point. It is also reassuring that the
drift velocity increases as τ increases, since more time between collisions means more time spent
accelerating, and that in principle lighter carriers would have a higher velocity since they are more
easily accelerated. Finally, the proportionality with the electric field is what we expect.
For typical metals, we can estimate [22] drift velocities of about $5\times10^{-3}$ m/s for a moderate electric
field of 1 V/m, about seven orders of magnitude below the thermal velocity! Really, the effect of the
electric field is quite negligible in one sense, but it has profound consequences.
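Eq. 5.17 can be evaluated directly to see how small the drift velocity is. This is a rough sketch: the collision time of 2 × 10^−14 s is the order-of-magnitude estimate for copper from the footnote, the thermal speed of 10^5 m/s is the rounded figure from Eq. 5.10, and the function name is hypothetical.

```python
E_CHARGE = 1.6e-19     # C, magnitude of the electron charge
M_ELECTRON = 9.11e-31  # kg, electron mass


def drift_velocity(field, tau, charge=-E_CHARGE, mass=M_ELECTRON):
    """Drift velocity v_d = (q * tau / m) * E; negative for electrons (Eq. 5.17)."""
    return charge * tau * field / mass


tau_cu = 2e-14  # s, rough collision time for copper (order of magnitude only)
v_drift = drift_velocity(1.0, tau_cu)  # field of 1 V/m
v_thermal = 1e5                        # m/s, rounded thermal speed

print(f"v_d = {v_drift:.2e} m/s")
print(f"|v_th / v_d| = {abs(v_thermal / v_drift):.1e}")
```

The sign of the result is negative, reflecting that electrons drift opposite to the field; the magnitude is a few millimeters per second, the same order as the estimate quoted above.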
Another way of seeing this is as an application of Newton’s laws – ∆p/∆t is nothing more than
force, and the equations above are also in some sense a force balance between the electric force,
and the impulse force due to the collision.
Instead of dealing with the mean time between collisions, we could just as easily have started with
the mean distance that electrons travel before undergoing a collision. [x] This quantity is known as
the mean free path, $\lambda_{mfp}$, and it has essentially the same meaning as it does in the kinetic theory
of gases. The shorter the time between collisions, the smaller the mean free path, and vice versa.
The mean time and mean free path are easily related through kinematics:
$$\lambda_{mfp} = \left(v_d + v_{th}\right)\tau \approx v_{th}\,\tau$$
Here we are considering the total distance covered, not just the net displacement, so we need to use
the total velocity, $v_d + v_{th}$. For the last relationship, we have made use of the fact that $v_{th} \gg v_d$.
What this means is that the mean distance (and mean time) between collisions does not really
depend on the applied electric field, but really only comes from the random thermal motion of the
carriers.
[x] Here we do mean the distance covered between collisions, not the displacement.
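How good is it to neglect the drift velocity when estimating the mean free path? The sketch below compares the full and approximate expressions, using the rounded figures from the text as assumed inputs (thermal speed 10^5 m/s, drift speed 5 × 10^−3 m/s, collision time 2 × 10^−14 s). The function name is made up for illustration.

```python
def mean_free_path(v_thermal, tau, v_drift=0.0):
    """Distance covered between collisions: (v_d + v_th) * tau."""
    return (v_drift + v_thermal) * tau


v_th_est = 1e5   # m/s, rounded thermal speed
v_d_est = 5e-3   # m/s, rounded drift speed at 1 V/m
tau_est = 2e-14  # s, rough collision time for copper

mfp_full = mean_free_path(v_th_est, tau_est, v_d_est)
mfp_approx = mean_free_path(v_th_est, tau_est)
relative_error = (mfp_full - mfp_approx) / mfp_full

print(f"mean free path ~ {mfp_approx:.1e} m")
print(f"relative error from dropping v_d: {relative_error:.1e}")
```

With these inputs the mean free path is a few nanometers, and dropping the drift term changes it by less than one part in ten million.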
As another aside, the proportionality constant between drift velocity and electric field in Eq. 5.17
is often called the carrier mobility, which is just what it sounds like. In this case, we write vd = µE,
where µ is the mobility:
Carrier mobility:
$$v_d = \mu E \quad \text{with} \quad \mu = \frac{q\tau}{m} \tag{5.19}$$
where q is the charge of the carrier, m its mass, and vd its drift velocity. Mobility relates
the drift velocity of carriers to the applied electric field. The units for mobility are
m²/(V·s).
From the units of µ (m²/(V·s)) and E (N/C or V/m), we can see that mobility is a quantity that
tells us how far a charge is able to move per second per unit of electric field (V/m). Now we have
a nice expression for exactly what we mean by mobility, rather than just a vague notion.
From here, the rest is easy, we already derived Eq. 5.6 above, relating drift velocity and current
density! Plugging Eq. 5.17 into Eq. 5.6:
$$J = \frac{I}{A} = nqv_d = -ne\left(\frac{-eE\tau}{m_e}\right) = \frac{ne^2\tau}{m_e}E \equiv \frac{1}{\varrho}E \tag{5.20}$$
where A is the cross-sectional area of the conductor, E is the electric field, q is the
charge per electron of −e, n is the density of electrons in the material, τ is the average
time between electron collisions, and ϱ is a constant of proportionality known as the
resistivity of the conductor.
In the end, it turns out that current density (or current) and electric field are simply proportional.
We could almost have guessed this in the first place, but now we have a formal relationship between
the two, and we even know the constant of proportionality. In this regard, we have sneakily defined
a new quantity ϱ, the electrical resistivity, which is the constant of proportionality between current
density and electric field: [xi]
$$\varrho = \frac{m_e}{ne^2\tau} \tag{5.21}$$
where n is the density of electrons in the material, and τ is the average time between
electron collisions.
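Eq. 5.21 can be used for a rough estimate of the resistivity of copper. The inputs below are assumptions for illustration: a free-electron density of about 8.5 × 10^28 per m^3 (a commonly quoted figure for copper, somewhat more precise than the order-of-magnitude value given earlier) and the rough collision time of 2 × 10^−14 s. The function name is hypothetical.

```python
E_CHARGE = 1.6e-19     # C, magnitude of the electron charge
M_ELECTRON = 9.11e-31  # kg, electron mass


def drude_resistivity(carrier_density, tau, charge=E_CHARGE, mass=M_ELECTRON):
    """Resistivity from Eq. 5.21: rho = m / (n e^2 tau), in ohm-meters."""
    return mass / (carrier_density * charge**2 * tau)


n_cu = 8.5e28    # 1/m^3, assumed free-electron density of copper
tau_cu = 2e-14   # s, rough collision time (order of magnitude only)

rho_cu = drude_resistivity(n_cu, tau_cu)
print(f"rho ~ {rho_cu:.1e} ohm*m")
```

The estimate comes out at a couple of times 10^−8 Ω·m, the right order of magnitude for a good metallic conductor. Doubling either the carrier density or the collision time halves the result.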
Resistivity represents the effectiveness with which a given electric field or potential difference
causes a current to flow, and is a (strongly) material-dependent property – it is a measure of
the resistance of a material to current flow. We see that the resistivity gets larger when the time
between electron collisions gets smaller, just as we would expect, and it gets smaller when we increase
the density of free carriers. We will return to the resistivity of various materials shortly. We can go
further in our analysis by noting that the potential difference and electric field are simply related
by Eq. 5.7, E = ∆V /l, which leads us to:
$$J = \frac{I}{A} = \frac{1}{\varrho}\,\frac{\Delta V}{l} \quad \text{or} \quad \Delta V = \frac{\varrho l}{A}\,I = \varrho l J \tag{5.22}$$
[xi] We will use a slightly different rho character for resistivity, ϱ, to distinguish it from the one we use for mass
density, ρ.
Ohm’s Law:
Current through and voltage across a conductor are proportional, the constant of pro-
portionality is the resistance of the conductor.
$$\Delta V = IR \quad \text{or} \quad I = \frac{\Delta V}{R} \quad \text{or} \quad R = \frac{\Delta V}{I} \tag{5.23}$$
5.4.3 Resistance
The presence of resistance does not mean that conductors “lose” current, greater resistance just
lessens the ability of a given ∆V to create a current. Resistance is somewhat analogous to
viscosity for a liquid or kinetic friction – it just makes it harder for charge to flow, a
larger resistance requires a larger potential difference for the same current.
The units for resistance R are volts per ampere [V/A], or Ohms [Ω].
Not all conductors follow Ohm’s law. It is only valid for certain materials (it is valid for most
metals). Those conductors that do follow Ohm’s law give a specific resistance R for a given ∆V
and I, and are called “ohmic.” Those that do not follow Ohm’s law are simply “non-ohmic.” A
resistor is a circuit element made out of such an ohmic conductor, which provides a specific value
of R for use in a circuit. The circuit diagram symbol for a resistor is shown below, and Fig. 5.5
shows a picture of a common type of resistor.
Figure 5.6 shows current (I) as a function of voltage (V ) for a 200 Ω resistor (Ohmic), and
a red light-emitting diode (LED; non-Ohmic). For an Ohmic device, the slope of an I vs. V
curve is ∆I/∆V = 1/R. The higher the slope on the plot, the lower the resistance. The resistor
shows a constant slope, as expected, while the diode shows a slope which dramatically increases
at higher applied voltages – the resistance decreases dramatically as V increases. Note that this
measurement is for a “forward biased” LED, the threshold voltage of ∼ 1.5 V is clearly visible. For
negative voltages, essentially zero current flows through the LED (and there is no light output).
35
to point b in the resistor, they lower their potential energy by ∆V . This might be more clear if we
purposely ground point a, Fig. 5.7b, which defines Va = 0. Now charges start out at zero potential,
and lose ∆V after traversing the resistor, and end up with potential Vb = −∆V . Similarly, if we
ground point b, Fig. 5.7c, then Vb = 0, and Va = ∆V . What we have just figured out is one of the
basic functionalities of a resistor – controlling the relative voltages between points a and b when a
current is flowing.
Incidentally, this all still works if we are using a voltage source instead of a current source. If
instead we applied a voltage difference of ∆V between points a and b of the resistor using a voltage
source, the potential difference would create a current I = ∆V/R, in accordance with Ohm’s law.
Whether we source constant voltage, constant current, or some combination of the two, Ohm’s law
is still valid for resistors.
[Figure 5.7 panels: (a) current I from a to b, ∆V = Vb − Va = −IR; (b) Va = 0, Vb = −IR; (c) Va = +IR, Vb = 0.]
Figure 5.7: (a) When a current I is sourced through a resistor R, the voltage difference between the ends of the resistor is
∆V = IR. Alternatively, when a voltage ∆V is applied across the ends of the resistor, a current I = ∆V/R flows. (b) Grounding
point a on the resistor means Va = 0, and thus Vb = −IR. (c) Similarly, grounding point b means that Va = IR. Resistors can
thus be used to control voltage at specific points in a circuit, or the amount of current flow.
5.4.5 Resistivity of Materials
So far, we have mathematically derived the expected relationship between current, voltage, and
electric field. We have even found a way to relate the proportionality constants, resistance and
resistivity, to materials properties like the mean time between carrier collisions and mean free path.
Do the dependencies we found make any sense, though, and how do we relate this to what we will
actually measure in the lab? What resistance should we find for a particular copper wire, for example?
We can figure out if what we have makes sense qualitatively. From the discussion above, we can
see that the drift velocity, and resulting current, get larger if we apply our potential difference
between points very close together (make l small). We might also expect that the current depends
on how big the cross sectional area A of the conductor is – the larger A is, the
more charge carriers per unit time will be able to flow through comfortably.
We would expect, for a given ∆V applied over a conductor of length l, that I ∝ A and
I ∝ 1/l. This mostly makes sense (and it is what we have derived above): we need thick wires
to carry large currents, and for long wires we need larger ∆V to make the same current flow. So
how do we find out the “intrinsic” resistance of a material, independent of size? In fact, this is
exactly what the resistivity ρ is. If we know the resistivity of a conductor, and its dimensions, we
can calculate the expected resistance. And vice versa: if we know the resistance and dimensions
of a conductor, we can find its resistivity if we recall the definition of resistance based on Eq. 5.22.
$$\rho = \frac{RA}{l} \quad\text{or}\quad R = \frac{\rho l}{A} \qquad (5.24)$$
where ρ is the material’s resistivity, R is the resistance of the conductor, A is the
cross-sectional area of the conductor, and l is its length.
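As a quick numerical check of Eq. 5.24, the short Python sketch below estimates the resistance of a cylindrical copper wire. The resistivity value and the wire dimensions are assumptions chosen for illustration (they are not taken from Table 5.1), and the function name is hypothetical.

```python
import math

# Assumed, typical room-temperature resistivity of copper in ohm*m.
# This is an illustrative number, not a value quoted from Table 5.1.
RHO_COPPER = 1.7e-8


def wire_resistance(resistivity, length, radius):
    """Return R = rho * l / A for a cylindrical wire (SI units)."""
    area = math.pi * radius**2  # circular cross section
    return resistivity * length / area


# Hypothetical wire: 10 m long with a 0.5 mm radius.
r_wire = wire_resistance(RHO_COPPER, length=10.0, radius=0.5e-3)
print(f"R = {r_wire:.3f} ohm")

# Doubling both the length and the radius halves R, since R scales as l / r^2.
r_scaled = wire_resistance(RHO_COPPER, length=20.0, radius=1.0e-3)
print(f"R_scaled / R = {r_scaled / r_wire:.2f}")
```

Even ten meters of fairly thin copper wire comes out at a fraction of an Ohm, which is why we can usually neglect the resistance of connecting wires.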
Resistivity is material dependent. Copper, for example, is a better conductor than steel, which
is one reason why we use it for the vast majority of wiring (also, it is reasonably cheap!). Resistivity
does depend on extrinsic parameters, however. For given samples of, e.g., copper, the resistivity
can vary wildly depending on how pure the copper is, its microstructure, and many other factors.
Comparing the absolute resistivities of different materials is only truly valid for extremely
pure, perfect crystals at low temperatures. One can make a very dirty sample of copper that is a
worse conductor than steel at room temperature, for example. The resistivities of many common
conducting materials are listed in Table 5.1.
and it appears to be larger and fuzzier. As the temperature increases, the atoms appear slightly
“larger and fuzzier” from the electron’s point of view. This increases the scattering of the electrons,
the result of which is increased resistance.
Over a limited temperature range, the resistivity of many conductors increases linearly with
temperature, according to:
$$\rho = \rho_0 \left[ 1 + \alpha_\rho \,(T - T_0) \right] \qquad (5.25)$$
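To get a feel for the size of the effect in Eq. 5.25, here is a minimal Python sketch. The reference resistivity and temperature coefficient are assumed, typical textbook values for copper, and the linear form should only be trusted over a limited temperature range.

```python
# Assumed illustrative values for copper (not taken from this text):
RHO0_COPPER = 1.7e-8    # resistivity at the reference temperature, ohm*m
ALPHA_COPPER = 3.9e-3   # temperature coefficient of resistivity, 1/K
T0 = 293.0              # reference temperature, K


def resistivity(temperature, rho0=RHO0_COPPER, alpha=ALPHA_COPPER, t0=T0):
    """Linear model of Eq. 5.25: rho = rho0 * [1 + alpha * (T - T0)]."""
    return rho0 * (1.0 + alpha * (temperature - t0))


for t in (293.0, 343.0, 393.0):
    ratio = resistivity(t) / RHO0_COPPER
    print(f"T = {t:.0f} K: rho / rho0 = {ratio:.3f}")
```

With these assumed numbers, heating the metal by 100 K raises its resistivity by roughly 40 percent.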
ation of carefully fabricated platinum pieces is often used to actually measure temperature.
Figure 5.8 shows the roughly linear behavior of resistance with temperature for a thin film of
Cobalt. At the lowest temperatures, the resistance varies very little: as thermal fluctuations
become tiny, the dominant contribution to resistance is actually imperfections and impurities.
At higher temperatures, above ∼ 80 K in this case, the observed resistance is roughly linear with
temperature.

Figure 5.8: Resistance vs. temperature T (K) for a thin film of Cobalt between 4.2 and ∼ 120 K.
Above ∼ 80 K, resistance is roughly linear with temperature, as implied by Eq. 5.25.

A negative value of α_ρ is observed for semiconductors. This
is because conduction in semiconductors is fundamentally dif-
ferent from that in metals. At lower temperatures, charges in semiconductors are weakly bound
to host atoms and not very mobile, leading to a high resistivity. At higher temperatures, random
thermal motion of the charges overcomes this weak bonding, and the charges actually become more
mobile at higher temperatures. Based on Eq. 5.19, we would expect a larger mobility to lead to a
larger drift velocity for the same applied electric field (or voltage), and hence a larger current. A
larger current for the same voltage or electric field means a lower resistance, and the resistance of
semiconductors decreases as temperature increases.xii
xii
We are ignoring the fact that the number of carriers also increases as temperature increases in a semiconductor,
which is significant and also causes resistance to decrease as temperature increases.
$$\frac{\Delta PE}{\Delta t} = \frac{\Delta Q\,\Delta V}{\Delta t} = I\,\Delta V \qquad (5.26)$$
Here we used Eq. 5.1. The rate at which the charges lose potential energy is equal to the rate at
which the internal energy of the resistor rises. Energy change per unit time is nothing more than
power (which we will denote by a fancy scripted P to avoid confusion with pressure).
P = I∆V (5.27)
In fact, Equation 5.27 is valid for any type of device, Ohmic or not. We didn’t make any special
assumptions, only that the packet of charge ∆Q passes through a net potential difference of ∆V ,
so this result works for any sort of electronic device, not just resistors. If we do have an Ohmic
device, say just a plain resistor, we know the relationship between I and ∆V from Equation 5.23.
Substituting that into the expression above:
$$P = I^2 R = \frac{\Delta V^2}{R} \qquad (5.28)$$
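The equivalence of the three power expressions is easy to verify numerically. The sketch below borrows the 550 W, 130 V toaster from the problems at the end of the chapter and assumes, for illustration only, that the toaster behaves as an Ohmic device.

```python
# Toaster ratings from the end-of-chapter problem.
rated_power = 550.0    # W
rated_voltage = 130.0  # V

# P = I * dV holds for any device, so the current follows directly.
current = rated_power / rated_voltage

# Assumption: the toaster is Ohmic, so R = dV / I is a meaningful constant.
resistance = rated_voltage / current

p_from_current = current**2 * resistance        # P = I^2 R
p_from_voltage = rated_voltage**2 / resistance  # P = dV^2 / R

print(f"I = {current:.2f} A, R = {resistance:.1f} ohm")
print(f"I^2 R = {p_from_current:.1f} W, dV^2 / R = {p_from_voltage:.1f} W")
```

Both forms reproduce the rated power, as they must once Ohm's law is used to eliminate either the voltage or the current.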
∆V = I/R
∆V = IR
R = I/∆V
I = ∆Q/∆t
1.5 Ω
6.6 × 10⁻⁴ Ω
1.5 × 10⁻⁹ Ω
1500 Ω
4. Which of the following does not obey Ohm’s law? Check all that apply.
A resistor
A slab of Copper
A diode
An insulator
A capacitor
5. Consider the positive and negative charges moving horizontally through the four regions
below. Which one has the highest current? Consider the +x direction to be to the right.
A
B
C
D
6. When we power a light bulb, are we using up charges and converting them to light?
Yes, charges moving through the filament produce “friction” which heats up the filament and
produces light
Yes, charges are emitted and observed as light
No, charge is conserved. It is simply converted to another form such as heat and light.
No, charge is conserved. Charges moving through the filament produce “friction” which heats
up the filament and produces light.
7. The drift velocity of charges in a typical copper wire is very small, ∼ 10⁻³ m/s. At this rate,
it would take about 15 minutes after flipping the switch for your lights to come on. Why do
your lights actually come on almost instantaneously?
Charges are already in the wire. When the circuit is completed, there is a rapid rearrangement
of surface charges in the circuit.
Charges store energy. When the circuit is completed, the energy is released.
Charges in the wire travel very fast
The circuits in a home are wired in parallel. Thus, a current is already flowing
9. Suppose a current-carrying wire has a cross-sectional area that gradually becomes smaller
along the wire, so that the wire has the shape of a very long cone. How does the drift speed
vary along the wire?
10. If the number of carriers in a conductor n decreases by 100 times, but the carriers’ drift
velocity vd increases by 5 times, by how much does its resistance change?
It increases by 20 times.
It decreases by 500 times.
It decreases by 20 times.
It increases by 500 times.
12. The figure at right shows the current-voltage relationship for a light-emitting
diode (LED) and a resistor. When the voltage is 1.7 V, which has the higher resistance?
The resistor.
The LED.
Cannot be determined.
They have the same resistance.
[Figure: current I (mA) vs. voltage V (Volts) from 0 to 2 V, for a red light-emitting diode and a
200 Ω resistor.]
13. Suppose a (cylindrical) electrical wire is replaced with one of the same material, but having
every linear dimension doubled - the length and radius are twice their original values. Does the
new wire have:
5.7 Problems
2. In a time interval of 1.37 sec, the amount of charge that passes through a light bulb is 1.73 C.
How many electrons pass through the bulb in 5.00 sec?
3. A toaster is rated at 550 W when connected to a 130 V source. What current does the toaster
carry?
4. (a) A high-voltage transmission line with a diameter of 1.60 cm and a length of 200 km carries
a steady current of 1000 A. If the conductor is copper wire with a free charge density of
n = 8.20 × 10²⁸ electrons/m³, how long does it take one electron to travel the full length of the
line?
(b) A high-voltage transmission line carries 1000 A starting at 600 kV for a distance of 150 mi.
If the resistance in the wire is 0.5 Ω/mi, what is the power loss due to resistive losses?
1. An electric current is the rate at which charge flows through a surface. If a net amount of
charge ∆Q flows perpendicularly through a surface cross section of area A in a time interval ∆t,
the electric current I is the net charge divided by the amount of time, as given by Equation 5.1
2. ∆V = IR correctly states Ohm’s law. The current through a conductor and the voltage across it
are proportional, and the constant of proportionality is the resistance of the conductor. Ohm’s law is
stated in Equation 5.23.
4. Diodes, insulators, and capacitors do not obey Ohm’s law. A resistor by definition obeys
Ohm’s law. A normal conductor like copper also obeys Ohm’s law. A diode has a non-linear
I − V relationship, and therefore does not obey Ohm’s law. An insulator has no mobile charges
and cannot conduct current, so it does not obey Ohm’s law. A capacitor also does not
let a constant current pass through it, and does not obey Ohm’s law.
5. A has the largest current. There are really only three rules to keep in mind: (1) a negative
charge moving in one direction is the same thing as a positive charge moving in the opposite
direction, (2) a positive and a negative charge moving in the same direction cancel out, and (3)
two charges of the same sign moving in opposite directions cancel out. With that in mind ...
In figure A, there are 3 positive charges moving to the right, and two negative charges moving
to the left, the same as 5 positive charges moving to the right.
In B, four positive charges move to the left, which gives a negative current.
In C, two positive charges moving to the right and two negative charges moving to the left gives
the same as four positive charges moving to the right.
In D, this is the same as two positive charges moving to the left, for a negative current.
6. No, charge is conserved. Charges moving through the filament produce “friction” which
heats up the filament and produces light.
Charges are not used up, and charge cannot be converted to heat or light. The “friction” charges
experience is resistance, which leads to a conversion of the charges’ electrical potential energy
into vibrational energy in the wire (heat) through collisions between the charges and atoms in
the wire. The filament heats up due to the collisions between the charges and its atoms, and
glows as it gets hotter.
7. Charges are already in the wire. When the circuit is completed, there is a rapid rearrange-
ment of surface charges in the circuit.
This one can be solved by elimination if nothing else. Clearly, the charges in the wire are not
traveling very fast, the problem states this. That takes out the third answer. Wiring the house
in parallel does not make a difference – there is no current flowing through the light bulb when
the switch is off no matter how the house is wired. If there were a current already, the light
would be on! If this were true, what good would the switch be? There can be a current flowing
in adjacent circuits, but this is not relevant for the bulb itself. This takes out the fourth answer.
Charges do not store energy just sitting in a wire, their energy only changes by moving between
regions of differing electrical potential. Electrical potential energy is also not ‘released’ by the
charges. Once a current flows, the charges collide with the atoms and the electrical potential en-
ergy is converted to vibrational energy of the atoms in the wire. This process requires a current
to flow, so we still have to reconcile the tiny drift velocity with the almost instantaneous action
of the light switch. Electrical potential energy cannot just magically be converted to light. This
would be the same as saying that gravitational potential energy could just be released by an
object. How? And released to where? The second answer, though it seems halfway reasonable
at first, is just using a bunch of words that sound right in a non-meaningful way.
The real answer is that the wire is already full of charges. Turning on the light switch pushes
charges in one end of the wire, and this displaces the charges already in the wire all along its
length. The charges on the far end of the wire are pushed out as a result, and this is how current
flows almost instantaneously – even though a single charge moves slowly, each charge pushes its
neighbor further down the wire, and the net movement of charge occurs rapidly across the wire.
It is the same as turning on the hot water faucet in a way. Water comes out right away – the
pipe is already filled with water. Hot water only comes out after some time, since it takes a
while for water to go from the water heater to the faucet. Charges come out of the wire right
away, but they are not the same charges entering the other end of the wire – the wire is already
full of charge.
8. If you double the current through a resistor, the potential difference doubles. Since I = ∆V /R,
if I doubles and R remains the same, ∆V must also double. This is a conceptual question, but
one that is most easily answered with a bit of algebra. Recall the relation between potential
difference, current, and resistance (Ohm’s law):
$$R = \frac{\Delta V}{I}$$
If we double the current I to 2I, and the resistance remains the same, it is easy to see that
∆V must also double:
$$R = \frac{(?)\,\Delta V}{2I} \implies (?) \text{ must equal } 2$$
9. The drift speed increases as the cross section becomes smaller. We can relate current, area,
and drift velocity using Eq. 5.5:
$$I = v_d\,n q A \quad\text{or}\quad v_d = \frac{I}{nqA}$$
This tells us that drift velocity scales inversely with the area, so if the area decreases, the drift
velocity must increase. Again, it works the same way for water in pipes: for a given flow rate,
the narrower the pipe, the larger the velocity.
10. This is easily answered with some algebra. First, we recall the relation between current
and drift velocity:
I = nqAvd
What we are really after is the resistance, however, which we can find with Ohm’s law:
$$R = \frac{\Delta V}{I} = \frac{\Delta V}{nqAv_d} \propto \frac{1}{n v_d}$$
So the resistance is inversely proportional to the carrier density and drift velocity. Let’s say the
initial resistance is R0 , and the resistance after changing n and vd is just R. If we decrease the
number of carriers by 100 times, the resistance goes up by 100 times. If we increase the drift
velocity by 5 times, the resistance goes down by 5 times.
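$$R_0 \propto \frac{1}{n v_d}$$
$$R \propto \frac{1}{\left(\frac{n}{100}\right)(5 v_d)} = \frac{1}{\frac{1}{20}\,n v_d} = \frac{20}{n v_d}$$
$$\implies R = 20 R_0$$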
Even though we don’t know what the actual resistance R0 is, we can say that R is twenty times
more. The one tricky step here is to write down the proper relationship between resistance and
the given quantities, not just the relationship between current and the given quantities.
11. What we have to remember here is that grounding a point in a circuit defines its potential
to be zero, so Vb = 0. First, consider the resistor R. If there is a current I flowing through
it from left to right, we know that the potential difference between points a and b must be
∆Vba = Vb−Va = −IR. That is, the presence of a current I means that there is a drop of potential
for charges going across the resistor. If we know that the potential at b is zero due to the ground
point, Vb = 0, then in order to satisfy ∆Vba = Vb −Va = −IR, we have to have Va = +IR.
12. The LED has the higher resistance. Resistance is just voltage divided by current. If we
pick a constant voltage of 1.7 V, then whichever component has a lower current has a higher
resistance. At 1.7 V, the curve for the LED is well below that of the resistor, so the LED has a
much smaller current at the same voltage, and thus a higher resistance.
13. Half the resistance. Let’s say the original resistance is R0, and the original wire has a
length l0 and radius r0. Since the material is the same, we can presume that the resistivity ρ is
the same as well. The original resistance can be written in terms of the resistivity, length, and
cross-sectional area (A = πr0²) of the wire:
$$R_0 = \frac{\rho l_0}{A} = \frac{\rho l_0}{\pi r_0^2}$$
The new wire, with every dimension doubled but the same resistivity ρ, has resistance:
$$R = \frac{\rho\,(2 l_0)}{\pi (2 r_0)^2} = \frac{2 \rho l_0}{4 \pi r_0^2} = \frac{1}{2}\,\frac{\rho l_0}{\pi r_0^2} = \frac{1}{2} R_0$$
1. 2.9 × 10⁻⁴ Ω · m. We first need to know the relation between resistivity and resistance, which
includes the cross-sectional area of the wire A and its length l:
$$R = \frac{\rho l}{A} \quad\text{or}\quad \rho = \frac{RA}{l}$$
And then we add in the relation between current, voltage, and resistance, viz. R = ∆V/I.
$$\rho = \frac{RA}{l} = \frac{\frac{\Delta V}{I}\,A}{l} = \frac{\Delta V \cdot A}{I \cdot l}$$
The wire is said to have a uniform radius, which can only be true if its cross section is circular.
The area of the circular cross section is then just A = πr². Making sure we keep track of the
units, we just plug everything in and run the numbers:
$$\rho = \frac{\Delta V \cdot A}{I \cdot l} = \frac{11\,\text{V} \cdot \pi \left(3.8\times10^{-3}\,\text{m}\right)^2}{0.45\,\text{A} \cdot 3.8\,\text{m}} = 2.9\times10^{-4}\,\frac{\text{V}\cdot\text{m}}{\text{A}} = 2.9\times10^{-4}\,\Omega\cdot\text{m}$$
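The arithmetic above can be double-checked with a few lines of Python. The numbers are the ones that appear in the worked equation; the function name is just a hypothetical helper for this sketch.

```python
import math


def resistivity_from_measurement(voltage, current, radius, length):
    """Return rho = (dV / I) * A / l for a wire of circular cross section."""
    resistance = voltage / current   # Ohm's law, R = dV / I
    area = math.pi * radius**2       # A = pi r^2
    return resistance * area / length


# Values from the worked solution: 11 V, 0.45 A, r = 3.8 mm, l = 3.8 m.
rho = resistivity_from_measurement(voltage=11.0, current=0.45,
                                   radius=3.8e-3, length=3.8)
print(f"rho = {rho:.2e} ohm*m")
```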
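2. 3.95 × 10¹⁹ electrons. We know that each electron carries a charge of −1.6 × 10⁻¹⁹ C, so
if we can figure out how much total charge has flowed through the bulb in 5 seconds, we can
divide by the magnitude of the charge per electron to get the total number of electrons. First, we
can calculate the amount of charge per second - the current - over the first 1.37 seconds from the
given quantities:
$$I = \frac{\Delta Q}{\Delta t} = \frac{1.73\,\text{C}}{1.37\,\text{s}} = 1.26\,\text{C/s} = 1.26\,\text{A} \qquad (5.29)$$
Next, we can find the total charge that passes in 5 seconds by rearranging the formula:
$$\Delta Q = I\,\Delta t = \left(\frac{1.73\,\text{C}}{1.37\,\text{s}}\right)(5.00\,\text{s}) \approx 6.31\,\text{C}$$
Finally, we can divide the total charge by the charge per electron (carrying the unrounded
intermediate values):
$$N = \frac{\Delta Q}{e} = \frac{\Delta Q}{1.6\times10^{-19}\,\text{C/electron}} \approx 3.95\times10^{19}\ \text{electrons}$$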
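The three steps of this solution (current, then total charge, then number of electrons) can be sketched in Python as follows. The variable names are hypothetical, and the elementary charge is rounded to 1.6 × 10⁻¹⁹ C as in the text.

```python
E_CHARGE = 1.6e-19  # magnitude of the electron charge in C, rounded as in the text

charge_measured = 1.73  # C, charge passing through the bulb ...
time_measured = 1.37    # s, ... during this time interval
time_total = 5.00       # s, interval we are asked about

current = charge_measured / time_measured  # I = dQ / dt
charge_total = current * time_total        # dQ = I * dt
n_electrons = charge_total / E_CHARGE      # N = dQ / e

print(f"I = {current:.2f} A")
print(f"total charge = {charge_total:.2f} C")
print(f"N = {n_electrons:.2e} electrons")
```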
3. 4.23 A. We know that Watts (W) are a unit of power, and that electrical power can be
expressed as P = I∆V. Since we know P and ∆V, it is straightforward to find I, remembering
that 1 W = 1 V · A:
$$I = \frac{P}{\Delta V} = \frac{550\,\text{W}}{130\,\text{V}} = 4.23\,\text{W/V} = 4.23\,\text{V}\cdot\text{A/V} = 4.23\,\text{A} \qquad (5.32)$$
4. (a) about 16.7 years. First things first: to find out how long the electron takes to travel
the length of the line, we need to know its velocity (since we already know the length). We can
calculate drift velocity from the density of electrons, their individual charge, the current, and
the cross-sectional area of the wire (noting that we are given the diameter, not the radius, and
converting that to meters):
$$I = n q v_d A \qquad (5.33)$$
$$\implies v_d = \frac{I}{nqA} \qquad (5.34)$$
$$= \frac{1000\,\text{A}}{\left(8.20\times10^{28}\,\text{electrons/m}^3\right)\left(1.60\times10^{-19}\,\text{C/electron}\right)\pi\left(\frac{0.016\,\text{m}}{2}\right)^2} \qquad (5.35)$$
$$= 3.79\times10^{-4}\,\text{m/s} \qquad (5.36)$$
Here we used the fact that 1 A = 1 C/s to make the units come out properly. Next, given a
velocity vd and a distance d, we can calculate how long the journey takes:
$$\Delta t = \frac{d}{v_d} = \frac{200\times10^{3}\,\text{m}}{3.79\times10^{-4}\,\text{m/s}} = 5.28\times10^{8}\,\text{s} \approx 16.7\,\text{yr} \qquad (5.37)$$
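A short Python sketch of the same calculation is given below, using the values from the problem statement. The conversion from seconds to years assumes 365.25 days per year, which is our choice and is not specified in the text.

```python
import math

n = 8.20e28        # free electrons per m^3
q = 1.60e-19       # C per electron
current = 1000.0   # A
diameter = 0.016   # m (1.60 cm)
length = 200e3     # m (200 km)

area = math.pi * (diameter / 2) ** 2   # cross section from the radius
v_drift = current / (n * q * area)     # v_d = I / (n q A)
transit_time = length / v_drift        # seconds

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed year length
years = transit_time / SECONDS_PER_YEAR

print(f"v_d = {v_drift:.2e} m/s")
print(f"transit time = {transit_time:.2e} s = {years:.1f} yr")
```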
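(b) 75 MW. The power loss in the wire is most easily calculated from the current and resistance:
P = I²R. We can find the resistance of the whole wire from the length and the resistance per
unit length, R = (0.5 Ω/mi)(150 mi) = 75 Ω:
$$P = I^2 R \qquad (5.39)$$
$$= (1000\,\text{A})^2\,(75\,\Omega) \qquad (5.40)$$
$$= 7.5\times10^{7}\,\text{A}^2\cdot\Omega \qquad (5.41)$$
$$= 7.5\times10^{7}\,\text{A}^2\cdot\text{V/A} \qquad (5.42)$$
$$= 7.5\times10^{7}\,\text{V}\cdot\text{A} \qquad (5.43)$$
$$= 7.5\times10^{7}\,\text{W} = 75\,\text{MW} \qquad (5.44)$$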
6.1 Sourcing Voltage
Direct current: a constant flow of charges in a single direction; voltages and currents
do not change in time. It is often abbreviated dc or DC; dc is preferred.
In reality, pure voltage sources do not exist. Real voltage sources always have internal resistances,
which “use up” some of the voltage, and they have power limits which restrict the amount of current
that can be sourced. A real voltage source is one which can supply, at best, a specified voltage, but
the actual output may be less. Let us make this clearer by example.
Real batteries always have some internal resistance r, as illustrated in Figure 6.2. Real batteries
therefore behave as a voltage source ∆V in series with an internal resistance r. This has the effect
that the voltage at the battery terminals is always less than the rated value. Consider the circuit
in Figure 6.3a, a battery specified to provide ∆V Volts connected to a resistor of value R.ii If we
neglect the internal resistance of the battery, the potential difference across the battery terminals
is ∆V as rated. The rated voltage of a battery is the idealized terminal voltage of the battery in
the limit that the battery itself has no internal resistance.
Figure 6.2: A real battery provides a voltage ∆V, but has an internal resistance r. The actual
output voltage developed at its terminals depends on r and the resistance of the circuit hooked
up to the battery.
Now let us analyze the circuit of Figure 6.3b, the circuit diagram representation of the pictorial
version in Figure 6.3a. The battery itself is everything inside the blue rectangle, and is modeled as
a source of voltage ∆V in series with an internal resistance r.
Figure 6.3: (a) A circuit consisting of a resistor connected to the terminals of a battery. (b) A
circuit diagram of a source of voltage ∆V having an internal resistance r, connected to an
external resistor (load) R.
[Panel (b) labels: points a, b, c, and d; current I; source ∆V with internal resistance r; load R.]
First, consider what happens to a positive charge moving through the battery from point a to b.
As the charge goes from the negative to positive terminal of the battery, its potential increases by
∆V. Once it goes through the internal resistor r, however, its potential decreases by an amount Ir,
where I is the current in the circuit. Thus, the voltage at the battery output terminals, points a
and b, is the raw voltage ∆V minus the loss due to the internal resistance:
$$\Delta V_{\text{terminals}} = V_b - V_a = \Delta V - Ir \qquad (6.1)$$
ii We are assuming, as we almost always will, that wires connecting to the battery have no resistance.
This makes it clear that the voltage across the battery terminals ∆Vterminals is the same
as the rated voltage ∆V when the current is zero. This is why another name for the rated
voltage is the open-circuit voltage – rated and actual voltages are only the same for a real battery
when nothing is connected and no current flows.
Now consider the effect of connecting an external resistor in Figure 6.3b. The external resistor
(or resistive device, such as a light bulb) you are trying to power is often called the load resistance.
Since it is directly connected to the battery terminals, and the wires are assumed to be perfect,
it must have a potential difference across it of ∆Vterminals . The potential difference across the
load resistor and the current through it must also follow Ohm’s law, hence ∆Vterminals = IR (using
Eq. 5.23). Combining this with Equation 6.1, we can relate the rated battery voltage to the internal
resistance and the load resistance:
$$\Delta V_{\text{terminals}} = IR = \Delta V - Ir \qquad (6.2)$$
$$\implies \Delta V = IR + Ir \qquad (6.3)$$
The total rated voltage of the battery is partly spent on the load resistor and partly spent on
the internal resistance. This is just conservation of energy – charges must go all the way around a
closed loop and come back with the same energy. If not, we would have a perpetual motion device,
gaining energy out of thin air! Every bit of potential gained by charges from a voltage source must
be lost somewhere else in the circuit loop – in a resistor for example. We will see later that this is
part of a more general rule – the sum of voltage sources and voltage losses in a closed loop must
be zero.
Now that we have related the battery’s rated voltage ∆V to the internal and load resistances,
we can solve Eq. 6.3 for the current I through the battery and resistor:
$$I = \frac{\Delta V}{R + r} \qquad (6.4)$$
where R is the resistance of the load connected to the battery, r is the internal resistance
of the battery, and ∆V is the rated open-circuit battery voltage.
Now it is clear that the current delivered by the battery through the resistor actually depends
on both the resistor’s value and the internal resistance of the battery. If R ≫ r, of course we do not
need to worry about the internal resistance of the battery. When the load resistance is high enough
that we can neglect the internal resistance, nearly all of the rated voltage is developed across the
load resistor. We can explicitly write down the actual voltage developed across the load resistor R
to make this more clear:
Voltage delivered to a load by a voltage source:
$$\Delta V_{\text{load}} = IR = \Delta V\,\frac{R}{r + R} \qquad (6.5)$$
where the quantities are the same as in Eq. 6.4. The voltage delivered to the load depends
on the value of the load and internal resistances.
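To see how quickly the load voltage approaches the rated voltage as R grows, here is a minimal Python sketch of Eqs. 6.4 and 6.5. The rated voltage of 1.5 V and the internal resistance of 0.5 Ω are hypothetical values chosen only for illustration.

```python
def battery_with_load(emf, r_internal, r_load):
    """Return (current, load voltage) for a real battery driving a resistor."""
    current = emf / (r_load + r_internal)          # Eq. 6.4
    v_load = emf * r_load / (r_internal + r_load)  # Eq. 6.5
    return current, v_load


EMF = 1.5         # V, hypothetical rated (open-circuit) voltage
R_INTERNAL = 0.5  # ohm, hypothetical internal resistance

for r_load in (1.0, 10.0, 100.0, 1000.0):
    current, v_load = battery_with_load(EMF, R_INTERNAL, r_load)
    fraction = v_load / EMF
    print(f"R = {r_load:7.1f} ohm: I = {current:.4f} A, "
          f"V_load = {v_load:.4f} V ({fraction:.1%} of rated)")
```

With these assumed numbers, a load only twice the internal resistance receives two thirds of the rated voltage, while a load a few hundred times larger receives essentially all of it.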
Now it is even easier to see that when r is small enough to be neglected, the battery operates as
nearly an ideal voltage source, supplying almost the whole ∆V to the load itself. This is usually how
things work out.iii In a nutshell, to operate properly, voltage sources like high load resistances
compared to their internal resistance.
• Equations 6.1 and 6.5 indicate that the terminal voltage of a battery depends on
its own internal resistance and the load resistance, so a battery is not a constant
voltage source.
• Equation 6.4 indicates that the current supplied depends on the load resistance, so
a battery is not a constant current source.
We can also find the power output of our battery by multiplying Equation 6.3 by I. Keep in
mind that the total power output is the total voltage ∆V times the total current I.
$$P = I\,\Delta V = I \cdot I\,(r + R) = I^2 (R + r) = I^2 R + I^2 r \qquad (6.6)$$
The total power output I∆V of the battery is delivered both to the resistor and the battery’s
internal resistance, at the rate I²R to the resistor and I²r within the battery itself. Again, if
R ≫ r we do not need to worry about the power lost in the battery itself, and this is usually the
case. One thing to keep in mind: should you connect too small a load to the battery (for example
by short-circuiting it), such that r ∼ R or larger, you will immediately notice the I²r power
dissipated within the battery itself, in the form of heat.
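The split of power between the load and the internal resistance in Eq. 6.6 can be tabulated with a short Python sketch. As before, the 1.5 V source and 0.5 Ω internal resistance are hypothetical, and the smallest load stands in for a near short circuit.

```python
def power_budget(emf, r_internal, r_load):
    """Return total, load, and internal power for a real battery (Eq. 6.6)."""
    current = emf / (r_load + r_internal)
    return {
        "total": current * emf,               # I * dV
        "load": current**2 * r_load,          # I^2 R
        "internal": current**2 * r_internal,  # I^2 r
    }


EMF = 1.5         # V, hypothetical
R_INTERNAL = 0.5  # ohm, hypothetical

for r_load in (100.0, 1.0, 0.01):
    p = power_budget(EMF, R_INTERNAL, r_load)
    print(f"R = {r_load:6.2f} ohm: total = {p['total']:.3f} W, "
          f"load = {p['load']:.3f} W, internal = {p['internal']:.3f} W")
```

For the large load nearly all of the power reaches the resistor, while for the near short circuit almost all of it is dissipated inside the battery.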
Just to be complete, we can also write the total power output in another way, using Eq. 6.4 to
eliminate the current:
$$P = I\,\Delta V = \frac{\Delta V}{r + R} \cdot \Delta V = \frac{\Delta V^2}{r + R} \qquad (6.7)$$
iii
If not ... you probably have a badly designed circuit, and very quickly, a dead battery!
The expressions above tell us what power is delivered by the battery or voltage source. Batteries
and other voltage sources typically have power ratings (in Watts) which tell you the maximum
P that can be delivered. From the equations above, it is straightforward to calculate the proper
resistances, voltages, and currents within a given power rating.
Everything above applies not just to batteries, but to any sort of voltage source. All real voltage
sources have an internal resistance, and are subject to the same considerations above. Batteries,
however, have an additional constraint that they have a limited capacity. The available capacity of
a battery depends upon the rate at which it is discharged – if a battery is discharged at a high rate,
the available capacity will be lower. Conversely, discharging a battery at a low rate prolongs its
life. Batteries are usually given a capacity rating of A·h or mA·h along with their rated voltage.iv
From the rated voltage and capacity, we can calculate a product of power and hours which tells us
how long a battery can deliver a certain power:
$$P \cdot \text{hours} = \text{capacity} \cdot \Delta V \quad\text{or}\quad \text{hours} = \frac{\text{capacity} \cdot \Delta V}{P} \qquad (6.8)$$
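As a rough illustration of Eq. 6.8, the sketch below estimates the running time of the AA battery mentioned in the footnote (about 2.85 A·h at 1.5 V). The 0.5 W load is a hypothetical choice, and the estimate ignores the dependence of available capacity on discharge rate described above.

```python
def battery_hours(capacity_ah, voltage, power_w):
    """Return hours = capacity * dV / P, with capacity in A*h (Eq. 6.8)."""
    return capacity_ah * voltage / power_w


# AA alkaline values from the footnote; the 0.5 W load is hypothetical.
hours = battery_hours(capacity_ah=2.85, voltage=1.5, power_w=0.5)
print(f"estimated running time: {hours:.2f} h")
```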
Circuit diagram symbol for a current source: I
We can approximate a current source, however, with a single battery and resistor. In the circuit
of Fig. 6.3, a battery with internal resistance connected to a load resistor, the current through
the load is given by Eq. 6.4. If we make the load resistor very small (or equivalently, make the
internal resistance of the battery very large), r ≫ Rload, then the current through the load resistor
is I ≈ ∆V/r. This does provide a roughly constant current, but the power loss in the internal resistor
will be severe, and it is generally impractical to construct a current source in this way.
How more realistic constant current sources work internally is a bit beyond the scope of our
discussion. However, that does not prevent us from seeing how they behave when connected to a
circuit. In the same way that a real voltage source can be considered an ideal voltage source in
series with a resistor, a real current source can be considered an ideal current source in
parallel with a resistor, as shown in Fig. 6.4.
iv
For example, a typical alkaline AA battery has a capacity of ∼ 2.85 A·h at its rated voltage of 1.5 V.
v
You may have already guessed that this is a serious oversimplification. You would be right.
$$I_{\text{load}} = I\,\frac{r}{r + R} \qquad (6.9)$$
As you can see, the current through the load is nearly independent of the load resistance R, and
nearly equal to the source current I, when r ≫ R. In other words, current sources want low load
resistances, in contrast to voltage sources. This brings up one answer to a common question: is
it better to source current or voltage? If the load you are trying to source has a large resistance,
sourcing voltage is generally better. If the load resistance is small, sourcing current is generally better.vi
vi For sources, internal resistance is often called “output resistance.” Good laboratory current sources can have
internal resistances above 10¹⁴ Ω, while good laboratory voltage sources can have internal resistances below 1 Ω, so
with good equipment either I or ∆V can usually be sourced without issues. Noise is what usually determines which
is actually used, but even so, the rule of thumb stated is still valid.
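Equation 6.9 is the current-source counterpart of Eq. 6.5, and a brief Python sketch makes the comparison concrete. The 1 mA source current and 1 MΩ internal resistance are hypothetical values chosen for illustration.

```python
def load_current(i_source, r_internal, r_load):
    """Current through the load for a real current source (Eq. 6.9)."""
    return i_source * r_internal / (r_internal + r_load)


I_SOURCE = 1e-3    # A, hypothetical source current
R_INTERNAL = 1e6   # ohm, hypothetical internal (output) resistance

for r_load in (1e2, 1e4, 1e6):
    i_load = load_current(I_SOURCE, R_INTERNAL, r_load)
    print(f"R = {r_load:9.0f} ohm: I_load = {i_load * 1e3:.4f} mA")
```

As long as the load is much smaller than the internal resistance the full source current reaches it, and once the two are equal only half does.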
Figure 6.5 shows two resistors connected in series with a battery. The resistors could be, e.g.,
light bulbs or heaters, or just plain resistors. When the resistors R1 and R2 are connected to the
battery, the current through each resistor is the same. This makes sense – there is only
one single path in the circuit, so there can only be one current. This is because every charge that
flows through R1 must also flow through R2 and back to the battery. This is just conservation
of charge, in the same way we say that any water flowing into a pipe has to come out again.
We know the current is the same through both resistors, and conservation of energy tells us that
the potential difference between points a and c must equal the battery voltage ∆V , Equation 6.1.
The potential difference between a and c we can break up into the sum of the potential difference
between a and b and the potential difference between b and c: ∆Vac = ∆Vab + ∆Vbc . What is the
potential difference between points a and b? This is just the potential drop across the resistor R1 ,
IR1. Similarly, the potential difference between points b and c is IR2. Conservation of energy tells
us that the potential drop across both resistors together must equal the battery voltage:
$$\Delta V = IR_1 + IR_2 = I\,(R_1 + R_2)$$
The right hand side of this equation shows us that the potential drop across both resistors is
the same as it would be for a single resistor of Req = R1 +R2 . In other words, in series, resistors
just add together. No matter how many we have, the equivalent resistor of a series combination
is just the sum of the individual resistances. Notice, however, that the current through resistors
in series is the same. Further, since series resistors must have the same current, they all have the
same current as their equivalent resistance as well.
Req = R1 + R2 (6.11)
Req = R1 + R2 + R3 + . . . (6.12)
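The series rule can be checked numerically with the minimal Python sketch below. The 9 V battery and the two resistor values are hypothetical, and the battery is treated as ideal (no internal resistance).

```python
def series_resistance(*resistors):
    """Equivalent resistance of resistors in series (Eq. 6.12)."""
    return sum(resistors)


EMF = 9.0              # V, hypothetical ideal battery
R1, R2 = 100.0, 200.0  # ohm, hypothetical resistors

r_eq = series_resistance(R1, R2)
current = EMF / r_eq      # the same current flows through both resistors
drop_1 = current * R1     # potential drop across R1
drop_2 = current * R2     # potential drop across R2

print(f"R_eq = {r_eq:.0f} ohm, I = {current * 1e3:.1f} mA")
print(f"drops: {drop_1:.2f} V + {drop_2:.2f} V = {drop_1 + drop_2:.2f} V")
```

The two drops add up to the battery voltage, as conservation of energy requires.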
Figure 6.6: (a) Two resistors in series can act as a ‘voltage divider,’ providing two different
voltages from a single supply. (b) Using a variable resistor, this is one way to control audio
volume. A sliding bar on the variable resistor changes the values of R1 and R2 while keeping
their sum constant. When R1 is small and R2 large, most of the signal reaches the output (sliding
bar at the top). When R1 is large and R2 small, most of the signal is sent to the ground, and the
output is small.
$$V_{\text{out}} = \frac{R_2}{R_1 + R_2}\,V_{\text{in}}$$
The potential difference (voltage) across each resistor is different unless the resistors are
identical, hence series resistors are often called “voltage dividers.” This can be seen in
Figure 6.5: from the battery voltage ∆V we have generated two different (lower) voltages, ∆V1 = IR1
between points a and b, and ∆V2 = IR2 between points b and c. Figure 6.6 shows how a voltage
divider can be used as an audio volume control, using a single variable resistor.
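The divider relation in Figure 6.6, Vout = Vin R2 /(R1 + R2 ), is easy to check numerically. The following is a minimal Python sketch; the function name voltage_divider and the 10 kΩ total resistance are assumptions chosen for illustration, not values from the text.

```python
def voltage_divider(v_in, r1, r2):
    """Voltage across r2 when r1 and r2 are in series across v_in."""
    return v_in * r2 / (r1 + r2)

# Hypothetical volume control: a 10 kOhm variable resistor, so R1 + R2 stays fixed.
r_total = 10_000.0
for r1 in (1_000.0, 5_000.0, 9_000.0):
    r2 = r_total - r1
    ratio = voltage_divider(1.0, r1, r2)
    print(f"R1 = {r1:6.0f} Ohm, R2 = {r2:6.0f} Ohm -> Vout/Vin = {ratio:.2f}")
```

As R1 grows at the expense of R2, the printed ratio falls, which is the behavior described for the sliding bar.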
Because charge has to be conserved, just like water flowing into a network of pipes eventually
has to come out, the current I that enters point a must equal the total current leaving
that point: I = I1 +I2 . Since the potential drop must be the same across both resistors, we can
easily find I1 and I2 :
$$ I_1 = \frac{\Delta V}{R_1} \quad \text{and} \quad I_2 = \frac{\Delta V}{R_2} \tag{6.13} $$
We want to find a single equivalent resistor Req , such that I = ∆V /Req . First, we just write
down the expression for the total current, and rearrange it a bit:
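\begin{align}
I &= I_1 + I_2 \tag{6.14} \\
  &= \frac{\Delta V}{R_1} + \frac{\Delta V}{R_2} = \Delta V \left( \frac{1}{R_1} + \frac{1}{R_2} \right) \tag{6.15} \\
  &= \Delta V \left( \frac{R_2}{R_1 R_2} + \frac{R_1}{R_1 R_2} \right) = \Delta V \, \frac{R_2 + R_1}{R_1 R_2} \tag{6.16} \\
  &= \frac{\Delta V}{R_{eq}} \tag{6.17}
\end{align}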
Now we can equate the right-hand sides of Equations 6.16 and 6.17 to find out what Req is:
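\begin{align}
\frac{\Delta V}{R_{eq}} &= \Delta V \, \frac{R_2 + R_1}{R_1 R_2} \tag{6.18} \\
\frac{1}{R_{eq}} &= \frac{R_2 + R_1}{R_1 R_2} \tag{6.19} \\
\frac{1}{R_{eq}} &= \frac{1}{R_1} + \frac{1}{R_2} \quad \text{or} \quad R_{eq} = \frac{R_1 R_2}{R_1 + R_2} \tag{6.20}
\end{align}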
So now we have derived that resistors in parallel add inversely, just like capacitors in series. The
potential difference (voltage) across resistors in parallel is the same, and the equivalent resistance
is always less than the smallest resistance in the group.
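\begin{align}
\frac{1}{R_{eq}} &= \frac{1}{R_1} + \frac{1}{R_2} \quad \text{or} \quad R_{eq} = \frac{R_1 R_2}{R_1 + R_2} \tag{6.21} \\
\frac{1}{R_{eq}} &= \frac{1}{R_1} + \frac{1}{R_2} + \frac{1}{R_3} + \ldots \tag{6.22}
\end{align}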
The current through each is different, hence parallel resistors are often called “current
dividers.” This can be seen from Figure 6.7 – from a single current I, we have generated two
different and smaller currents I1 and I2 .
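To make the “current divider” idea concrete, here is a small Python sketch of Equations 6.13 and 6.22. The function names and the numerical values (12 V, 100 Ω, 300 Ω) are assumptions made for illustration only.

```python
def parallel(*resistances):
    """Equivalent resistance of resistors in parallel (Eq. 6.22)."""
    return 1.0 / sum(1.0 / r for r in resistances)

def branch_currents(delta_v, *resistances):
    """Current through each parallel branch, I_k = delta_v / R_k (Eq. 6.13)."""
    return [delta_v / r for r in resistances]

# Illustrative values only: 12 V across 100 Ohm and 300 Ohm in parallel.
delta_v, r1, r2 = 12.0, 100.0, 300.0
r_eq = parallel(r1, r2)
i1, i2 = branch_currents(delta_v, r1, r2)
print(r_eq)              # equivalent resistance, smaller than either resistor
print(i1, i2, i1 + i2)   # branch currents and their sum
print(delta_v / r_eq)    # total current computed from the equivalent resistor
```

The sum of the branch currents agrees with the current through the single equivalent resistor, and the smaller resistor carries the larger share.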
What happens this time if one of the resistors fails? The other continues to be powered this time.
Household circuits are wired in parallel, so that each device operates independently of the others.
Further, all devices operate at the same voltage when wired in parallel. If they were connected
in series, the voltage seen by each device would depend on how many devices were connected and
their individual resistances. Parallel wiring is why the lights do not dim when you turn on the TV!
The disadvantage to parallel wiring is that when one device fails, the others would suddenly
see a larger current – if R1 failed in Figure 6.7, R2 would suddenly see the full current I, not just
I2 , which could cause serious problems. In reality, circuit breakers are inserted in series with each
device, which limit the current to some maximum value (typically 15 or 20 A).
Probably you have experienced this problem with older holiday lights. These lights are
wired in series, so if any single bulb on the string becomes “open,” no bulbs will light.
Modern equivalents have an internal “shunt” that activates when the bulb’s filament
burns out to avoid this problem. When the filament breaks, they actually short-circuit
the bulb to keep the circuit continuous and the other bulbs in the string lit.
See http://en.wikipedia.org/wiki/Christmas_lights for a more in-depth discussion.
Just as we saw with complex capacitor combinations (Sec. 4.6.4.3), once you know the rules for
series and parallel resistors, you can simplify most complicated resistor combinations.
Consider the example in Figure 6.8, where we have resistors R1 through R4 connected to a
battery (see footnote vii) supplying a voltage ∆V . Now trace the wires from the negative pole of
the battery. A current I will be present in the single wire leaving the battery, and it will split
up into I1 and I2 when it encounters the first junction. The currents I1 and I2 will recombine at
the junction just before R4 , and the current I goes back to the battery. Conservation of charge
tells us I = I1 + I2 .
Figure 6.8: (a) The current I leaving the battery splits up into I1 and I2 when it encounters
the junction, and recombines at R4 . Conservation of charge tells us I = I1 + I2 . We start to
simplify by combining the simple series pair R2 and R3 into R2−3 . (b) Now we can combine the
simple parallel pair R2−3 and R1 into R1−2−3 . The current through R2−3 is still I1 , and the current
through R1 is still I2 . (c) Now we are left with only a simple series pair, R1−2−3 and R4 , which
we combine into Req . The current through Req is just I. Now we can work backwards and find I1 ,
I2 , and the voltage drop across each resistor.
[Panels: (a) R1 in parallel with the series pair R2 , R3 , followed by R4 ; (b) R1 in parallel with R2−3 , followed by R4 ; (c) R1−2−3 and R4 in series; (d) Req alone. Each panel is connected to the battery ∆V .]
vii. We will assume that the battery’s internal resistance is negligible compared to any of the
resistors R1 -R4 so we may neglect it. If we want to include it, we can always do that by including
a resistor r in series with the voltage source, like in Figure 6.2.
We start to simplify by combining the simple series pair R2 and R3 into R2−3 , as shown in
Figure 6.8a-b. Equation 6.11 tells us:
R2−3 = R2 + R3 (6.23)
Once we have done that, Figure 6.8b, we have a simple parallel combination of R2−3 and R1 .
Equation 6.21 tells us that the equivalent resistance of these two, R1−2−3 (Figure 6.8c) is:
$$ \frac{1}{R_{1-2-3}} = \frac{1}{R_1} + \frac{1}{R_{2-3}} \quad \text{or} \quad R_{1-2-3} = \frac{R_1 R_{2-3}}{R_1 + R_{2-3}} \tag{6.24} $$
Now we are left with only a simple series pair (Figure 6.8c), R1−2−3 and R4 , which we combine
into Req (Figure 6.8d). Note that the current through Req is just I.
$$ R_{eq} = R_{1-2-3} + R_4 \tag{6.25} $$
Once we have a single resistor, Figure 6.8d, we know that ∆V = IReq = I (R1−2−3 + R4 ). If we
are given the values of the resistors and ∆V , we can calculate the current I, and the voltage drop
across R4 :
$$ I = \frac{\Delta V}{R_{eq}} \quad \text{and} \quad \Delta V_4 = I R_4 \tag{6.26} $$
Working backwards to Figure 6.8c, we know that the total voltage drop across R1−2−3 and R4
together is ∆V . Since the voltage drop across R4 alone is ∆V4 = IR4 , and the total voltage in the
whole circuit has to be ∆V , the voltage across R1−2−3 has to be ∆V − ∆V4 . This is just conservation
of energy again.
Now going back to Figure 6.8b, we know that since R1 and R2−3 are in parallel, they have the
same voltage drop, which has to be ∆V − ∆V4 . This gives us immediately the current I2 in R1 :
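$$ I_2 = \frac{\Delta V - \Delta V_4}{R_1} = \frac{\Delta V - I R_4}{R_1} \tag{6.27} $$
Since I = I1 + I2 and the total current is I = ∆V /Req , the current I1 follows directly:
$$ I_1 = I - I_2 = \frac{\Delta V}{R_{eq}} - \frac{\Delta V - \Delta V_4}{R_1} = \frac{\Delta V}{R_{eq}} - \frac{\Delta V - I R_4}{R_1} \tag{6.28} $$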
Finally, back to Figure 6.8a, since R2 and R3 are in series, they have the same current I1 given
above. That gives us the voltage drops across R2 and R3 as I1 R2 and I1 R3 , respectively. And now
we know everything about this circuit! Well, except what possible use it might have ... but that is
another topic entirely.
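The whole reduction for Figure 6.8 can be sketched in a few lines of Python: first collapse the network to Req , then work backwards for the currents and voltage drops. The resistor values and battery voltage below are hypothetical, chosen only so that the arithmetic is easy to follow, and the helper names series and parallel are likewise assumptions.

```python
def series(*rs):
    """Equivalent resistance of resistors in series."""
    return sum(rs)

def parallel(*rs):
    """Equivalent resistance of resistors in parallel."""
    return 1.0 / sum(1.0 / r for r in rs)

# Hypothetical values (not from the text): resistances in ohms, voltage in volts.
r1, r2, r3, r4 = 6.0, 1.0, 2.0, 4.0
delta_v = 12.0

# Reduce the network step by step (Figure 6.8a through 6.8d).
r23 = series(r2, r3)        # Eq. 6.23
r123 = parallel(r1, r23)    # Eq. 6.24
r_eq = series(r123, r4)     # final series pair

# Work backwards to recover the currents and voltage drops.
i = delta_v / r_eq          # total current from the battery
dv4 = i * r4                # drop across R4
dv_par = delta_v - dv4      # drop across R1, and across R2-3
i2 = dv_par / r1            # current in R1 (Eq. 6.27)
i1 = i - i2                 # current in R2 and R3
dv2, dv3 = i1 * r2, i1 * r3

print(f"Req = {r_eq:.2f} Ohm, I = {i:.2f} A, I1 = {i1:.3f} A, I2 = {i2:.3f} A")
print(f"drops: R2 {dv2:.3f} V, R3 {dv3:.3f} V, R4 {dv4:.2f} V")
```

A useful consistency check is that the drops across R2 and R3 add up to the drop across R1 , since both branches span the same two junctions.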
Circuit diagram symbol for a voltmeter: V
Of course, the idea is to measure the potential difference while disturbing the circuit as little
as possible. For this reason, voltmeters have very high internal resistances (see Fig. 6.10a and
footnote viii), and no current flows through an ideal voltmeter. As an example, Fig. 6.9a shows an
incorrect use of
a voltmeter – connecting the voltmeter in series with the resistor and battery. No current flows
through an ideal voltmeter, so connecting the voltmeter in this way essentially opens the circuit
and nothing is measured. Figure 6.9b shows the proper use of a voltmeter – in parallel with the
component to be measured, a resistor in this case. The voltmeter probes the potential on both
sides of the resistor, but since no current flows through it, it does not affect the circuit.
Real voltmeters are not ideal, as you might have guessed. A real voltmeter has a finite input
resistance and, even when connected properly, draws a small amount of current. As shown in
Fig. 6.10, a voltmeter connected properly forms a parallel resistor network with the load resistor
Rload . What the voltmeter really measures then is the voltage drop across not just the load, but
the equivalent resistance of the load in parallel with its own internal resistance r.
viii. Good laboratory voltmeters can have internal resistances on the order of 10^10 Ω or more. For
voltmeters, internal resistance is often called “input resistance.”
Put another way, the voltmeter forms a current divider with the load, and “steals” part of the
current through the load. The voltmeter “stealing” part of the current obviously leads to inaccurate
results, and the measured voltage drop across the resistor is no longer IRload like we expect. We
should try to figure out how bad this problem is! If we assume there is a current I in the wire
leading to the resistor, we can readily calculate the voltage measured by the voltmeter:
$$ \Delta V_{measured} = I R_{eq} = I \, \frac{R_{load} \, r}{R_{load} + r} = I R_{load} \, \frac{1}{1 + R_{load}/r} \tag{6.29} $$
The ratio between the measured voltage and the expected value is 1/(1 + Rload /r), which tells us
two things. First, the measured value is always smaller than the true value, since
1/(1 + Rload /r) ≤ 1. Second, so long as the load resistor is small compared to the internal
resistance of the meter, Rload ≪ r, the measured and expected values will be very close. Given the
enormous internal
resistance of most modern voltmeters, this is usually the case, but one must still exercise caution.
Using a meter with insufficient internal resistance is known as “measuring the meter,” and is
something you will encounter in your laboratory experiments.
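To get a feeling for how large the error from “measuring the meter” can be, the ratio 1/(1 + Rload /r) can be tabulated for a few loads. This is only a sketch: the function name measured_fraction and the 10 MΩ meter resistance are assumptions for illustration, not the specifications of any particular instrument.

```python
def measured_fraction(r_load, r_meter):
    """Ratio of measured to expected voltage, 1 / (1 + r_load / r_meter)."""
    return 1.0 / (1.0 + r_load / r_meter)

# Hypothetical meter with a 10 MOhm internal (input) resistance.
r_meter = 10e6
for r_load in (1e3, 1e5, 1e6, 1e7):
    frac = measured_fraction(r_load, r_meter)
    print(f"R_load = {r_load:10.0f} Ohm -> reads {100 * frac:.2f}% of the true voltage")
```

When the load equals the meter's internal resistance, the reading drops to half the expected value; when the load is thousands of times smaller, the error is negligible.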
Circuit diagram symbol for an ammeter: A
Figure 6.10: (a) An ideal voltmeter has an infinite internal resistance, and no current flows
through it. Hence, it measures the true voltage drop across the resistor, ∆V = IR. (b) A real
voltmeter has a finite internal resistance r, and forms a current divider with the load resistor.
Some current flows through the voltmeter itself if Rload is comparable to r, and the measured
voltage is less than the true voltage on the resistor.
[Panels: (a) ideal meter, Rload in parallel with voltmeter V, current I; (b) real meter, Rload in parallel with r and V, current I.]
$$ I = \frac{\Delta V_{measured}}{R_{precise}} \tag{6.30} $$
In this way currents can be measured reasonably accurately, but this is far from an ideal ammeter.
First, this technique of current measurement brings in all the non-idealities associated with real
voltmeters as discussed above. Second, placing a resistor within the circuit of interest introduces
an additional voltage drop, which can affect other components. Care must be exercised when using
this technique: either the precise resistor must be chosen so that its voltage drop is too small
to alter the circuit significantly, or the voltages on other components in the circuit must
be independently measured to take this effect into account, or the circuit must be designed from
scratch to account for this additional voltage drop.
ix. This is how we will measure currents in our laboratory sessions, see Appendix A.
Kirchhoff’s Rules:
1. The sum of currents entering any junction must equal the sum of the currents
leaving that junction. a.k.a. the “junction rule.”
2. The sum of the potential differences across all the elements around any closed circuit
loop must be zero. a.k.a. the “loop rule.”
The junction rule is nothing more than conservation of charge. Whatever charge flows
into a given point in a circuit has to flow out again; charges are neither created nor destroyed.
Figure 6.13 illustrates this rule, that the current entering the junction (I1 ) has to be the same as
the sum of the currents leaving the junction (I2 + I3 ), that is, I1 = I2 + I3 . Again using our fluid
analogy, this would be the same as a “tee” in a water pipe. The flow rate into the pipe equals the
total flow rate out of the two branches.
The loop rule is nothing more than conservation of energy. Like we saw in Section 6.3.3,
any charge that moves around a closed loop in a circuit must gain as much energy as it loses.
Charges gain energy when going through a source of voltage, and lose energy by way of a potential
drop across a resistor. Charges also lose energy by going backwards into a source of voltage. As one
example, potential energy of the charge is converted into chemical energy when charging a battery.
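Figure 6.14: Rules for determining potential differences. In each panel the element is traversed
from point a to point b.
(a) Resistor traversed in the direction of the current I: ∆V = Vb − Va = −IR
(b) Resistor traversed against the current I: ∆V = Vb − Va = +IR
(c) Voltage source E traversed from − to +: ∆V = Vb − Va = +E
(d) Voltage source E traversed from + to −: ∆V = Vb − Va = −E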
In general you can use the junction rule one time fewer than the total number of junction points
in the circuit. All this means is that one point has to be left such that your resulting circuit is
still a closed loop. The loop rule can be used as often as needed, until you have only one loop left.
Solving any particular problem requires as many unique independent equations as you
have unknowns. Figure 6.14 illustrates the rules for determining whether the voltage difference
across a resistor or battery is positive or negative when applying Kirchhoff’s rules.
In order to understand the power of Kirchhoff’s laws in analyzing complicated circuits (ones for
which the simple series and parallel rules will not work, for example), we should first analyze a
circuit we already understand. In that light, we will re-examine our parallel resistor circuit in
Fig. 6.7 according to our ‘Using Kirchhoff’s rules’ box below.
In this case, we have already assigned the proper symbols and labeled the currents in each
branch. The next step is to use the loop rule as many times as possible. For this circuit, there are
only two loops present - an upper one, containing R2 and I1 , and a lower one containing R1 and
I2 . For the upper loop, the loop rule says that the sum of all voltage sources and drops around the
loop must be zero. Traversing the loop clockwise from point a (an arbitrary choice), we first go
through R2 in the direction of the current, giving a voltage drop, and then through R1 against the
current, giving a voltage increase. The wires themselves are still assumed to be perfect, and give
no voltage changes. Accordingly, using the labeled current in each resistor:
\begin{align}
0 &= -I_1 R_2 + I_2 R_1 \tag{6.31} \\
\implies I_1 &= I_2 \, \frac{R_1}{R_2} \tag{6.32}
\end{align}
Now consider the bottom loop, again traversing clockwise from point a. We first go through R1
in the direction of the current, giving a voltage drop, and then go through the battery from − to
+, giving a voltage increase. Thus:
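\begin{align}
0 &= -I_2 R_1 + \Delta V \tag{6.33} \\
\implies \Delta V &= I_2 R_1 \tag{6.34} \\
\text{and} \quad I_2 &= \frac{\Delta V}{R_1} \tag{6.35} \\
I_1 &= I_2 \, \frac{R_1}{R_2} = \frac{R_1}{R_2} \, \frac{\Delta V}{R_1} = \frac{\Delta V}{R_2} \tag{6.36}
\end{align}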
That does it for the loop rule. Next, we need to apply the junction rule. Our only junctions are at
points a and b, and both give the same result:
I = I1 + I2 (6.37)
Now we can combine this result with the equations we got from the loop rule for I1 and I2 :
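\begin{align}
I &= I_1 + I_2 = \frac{\Delta V}{R_2} + \frac{\Delta V}{R_1} \tag{6.38} \\
I &= \Delta V \left( \frac{1}{R_2} + \frac{1}{R_1} \right) \tag{6.39} \\
\implies R_{eff} &= \frac{\Delta V}{I} = \left( \frac{1}{R_2} + \frac{1}{R_1} \right)^{-1} \tag{6.40}
\end{align}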
On inspection, this is exactly our formula for adding resistors in parallel. How about the case
of series resistors, Fig. 6.5? Much easier in fact. Again, we already have everything labeled, so we
just start with the loop rule – made simpler by the fact that we have only one loop now. We will
traverse it clockwise from point a once again, which means we first pass through resistors R1 and
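R2 in the direction of the current, and then through the battery in the positive direction:
\begin{align*}
0 &= -I R_1 - I R_2 + \Delta V \\
\implies \Delta V &= I (R_1 + R_2) \quad \text{so that} \quad R_{eq} = \frac{\Delta V}{I} = R_1 + R_2
\end{align*}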
There is no junction rule in this case, since we have no junctions! So this is it, which you should
recognize as our formula for adding resistors in series.
That was easy enough, right? Of course, that means it is time to consider a more pathological
example, such as the circuit in Fig. 6.15a. Don’t panic! If we systematically follow the rules, and
our handy-dandy guide for using them (Page 191), we can make short work of this circuit too. The
first step is to label all of the components and assign directions to the separate currents in each
unique branch, Fig. 6.15b. Again, if we guess the direction incorrectly, it isn’t a big deal - the
sign of the current will just come out negative, letting us know that the direction is opposite what
we expected. What is important is just to put down something so that the rules can be properly
applied.
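[Figure 6.15: (a) the circuit to be analyzed; (b) the same circuit with components labeled and
currents assigned. Labels visible in the figure: I1 , I2 , I3 , R2 , R4 , ∆V1 , ∆V2 , ∆V3 , point b.]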
Just to make things more concrete, let’s give all of our components some real values too:
The next step is to apply the loop rule as many times as possible. In this case, we have two
distinct loops: a loop on the left, including only currents I1 and I3 , and a loop on the right, including
only currents I2 and I3 . We will start by applying the loop rule to the left loop, traversing clockwise
from point b and remembering to follow the sign conventions:
Make sure you understand why each term has the sign that it does, using Fig. 6.14 as a reference
if necessary. Next, we can apply the loop rule to the right side loop, this time starting from point
(a) and moving clockwise:
Still, we have too many unknowns and not enough equations. The next step is to apply the
junction rule at points (a) and (b):
I1 = I3 + I2 (6.46)
I3 = I1 − I2
In fact, we get the same result applying the rule at either point. This makes sense, based on
conservation of charge and the way the circuit is set up. Sometimes applying the loop and junction
rules gives you duplicate results; this is not a problem per se. So how do we solve this mess of
equations? First, let’s put the numbers we already know into Eqns. 6.44, 6.45, and 6.46, so we
can see what we still have to find:
Now, let’s rearrange and simplify Eqs. 6.47 and 6.48, and then substitute one into the other:
I3 = I1 − I2 (6.50)
10I1 + I3 = 17 (6.51)
=⇒ 10I1 + (I1 − I2 ) = 11I1 − I2 = 17 (6.52)
Next, do the same thing for Eq. 6.47 and Eq. 6.49:
I3 = I1 − I2 (6.53)
4I2 − I3 = −4 (6.54)
=⇒ 4I2 − (I1 − I2 ) = −I1 + 5I2 = −4 (6.55)
We’re nearly done. Notice the similarity of Eq. 6.52 and 6.55 ... let’s multiply Eq. 6.52 by 5,
and add that to equation 6.55:
55I1 − 5I2 = 85
+ −I1 + 5I2 = −4
54I1 = 81 (6.56)
This gives us I1 = 1.5 A. Now that we know I1 , we can put that into Eq. 6.55 and solve for I2 :
$$ -1.5 + 5 I_2 = -4 \implies I_2 = \frac{-4 + 1.5}{5} = -0.5\,\text{A} \tag{6.57} $$
Finally, we can use Eq. 6.50 to determine that I3 = 2.0 A. Since both I1 and I3 came out positive,
our original guesses for their directions were correct. However, since I2 came out negative,
our initial guess for it was incorrect, and the current I2 actually flows in the opposite
direction. That was it!
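For circuits with more loops, solving the equations by hand gets tedious, and it is natural to hand the system to a computer. The sketch below solves the three equations above (Eqs. 6.50, 6.51, and 6.54) with exact fractions. Note that it uses generic Gauss-Jordan elimination rather than the substitution done by hand in the text, and that the function name solve_linear is a hypothetical helper, not a library routine.

```python
from fractions import Fraction

def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination using exact fractions.

    Assumes the system has a unique solution; a singular system raises
    StopIteration when no nonzero pivot can be found.
    """
    n = len(b)
    # Augmented matrix [a | b] in exact rational arithmetic.
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so that the pivot equals 1.
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n] for row in m]

# Unknowns ordered as (I1, I2, I3), in amperes.
coefficients = [
    [1, -1, -1],   # I1 - I2 - I3 = 0   (junction rule, Eq. 6.50)
    [10, 0, 1],    # 10 I1 + I3 = 17    (Eq. 6.51)
    [0, 4, -1],    # 4 I2 - I3 = -4     (Eq. 6.54)
]
constants = [0, 17, -4]

i1, i2, i3 = solve_linear(coefficients, constants)
print(float(i1), float(i2), float(i3))
```

The negative value returned for I2 carries the same meaning as in the hand calculation: the assumed direction for that current was opposite to the actual one.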
6.6 RC Circuits
So far we have worried only about circuits with constant currents. In this section, we will start to
analyze circuits whose current varies with time, though it is still in a single direction. The first
example we will consider is Figure 6.6a, a resistor, capacitor, a voltage source, and switch in series.
The switch S is open at first, and then suddenly closed. What happens? Before the switch S is
closed, no current can flow in the circuit. We also know that if we wait for a long enough time after
closing the switch, there can be no current – the capacitor will be charged to a value Q = C∆V , but
nothing else will happen.
As soon as the switch is closed, the voltage source ∆V begins to charge the capacitor C. What
the voltage source really wants to do is drive charges through the resistor and capacitor to create
a current. It can’t create a steady-state current in the capacitor, as we know, but the source is
persistent, and keeps sending charge to the capacitor as long as it can. It will keep doing this until
the capacitor is fully charged to its maximum value of Q = C∆V . The flow of charges out of the
source into the capacitor is, while it is going on, a current. The main difference now is that we
know this current can’t continue indefinitely, there is only a current present between the time we
close the switch S and the time when the capacitor is fully charged. In the end, the current into
the capacitor driven by the source diminishes over time, until it cannot pump any more charge into
the capacitor. Once the capacitor is full, we have reached our steady state of zero current.
We know the charge on the capacitor increases as a function of time, but in what fashion? There
is a simple formula for this, but the math behind its derivation is a bit tedious, and we will just
present the result. If we assume the capacitor is totally uncharged before we close the switch, and
we call the time at which the switch is closed t = 0, the charge on the capacitor varies according
to:
$$ q(t) = Q \left( 1 - e^{-t/RC} \right) \tag{6.58} $$
where e = 2.71828 . . . is Euler’s number, the base of natural logarithms (ln). This is what you see
in the left part of the plot in Fig. 6.6b. We can see from this equation that the charge at t = 0 is
zero (q(0) = 0), and approaches its maximum value of Q as t → ∞ (q(∞) = Q). We can write the
voltage on the capacitor as a function of time as well, since the relationship ∆VC (t) = q(t)/C must
still be true:
$$ \Delta V_C(t) = \frac{Q}{C} \left( 1 - e^{-t/RC} \right) = \Delta V \left( 1 - e^{-t/RC} \right) \tag{6.59} $$
In principle, this equation tells us that it would take an infinite amount of time to fully charge
the capacitor. This is just mathematics – the equation doesn’t know that charge is quantized and
comes in discrete bits of e = 1.6 × 10^−19 C.
Question: Why does the discreteness of charge imply that the capacitor’s charging time
must be finite? Have you heard of Zeno’s paradox?
The term RC that appears in Equations 6.58 and 6.59 is curious. As it turns out, the units of
RC end up being time, and the quantity RC we call the time constant, τ .
τ = RC (6.60)
This gives τ in seconds [s] when R is in Ohms [Ω] and C is in farads [F].
What this means is that the product of the resistance and capacitance determines how
long it takes to charge the capacitor! If we wait a time τ after throwing the switch, one time
constant, our capacitor has charged to 63.2% (= 1 − 1/e) of its maximal value Q. If you substitute
t = τ = RC in Equation 6.58 you can easily verify this. What is important is that the larger τ is,
the longer it takes to charge a capacitor, and the smaller τ is, the more quickly it charges. At ten
time constants (t = 10τ ), the capacitor is over 99.99% charged.
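The percentages quoted for one and ten time constants follow directly from Equation 6.58, and can be verified with a short Python sketch. The function name charge_fraction and the component values (1 kΩ, 1 µF) are assumptions for illustration; only the ratio t/τ matters for the result.

```python
import math

def charge_fraction(t, r, c):
    """q(t)/Q for a capacitor charging from zero through a resistor (Eq. 6.58)."""
    return 1.0 - math.exp(-t / (r * c))

# Illustrative values only: 1 kOhm and 1 microfarad give tau = 1 ms.
r, c = 1.0e3, 1.0e-6
tau = r * c
for n in (1, 2, 5, 10):
    pct = 100 * charge_fraction(n * tau, r, c)
    print(f"t = {n:2d} tau -> {pct:.3f}% charged")
```

Changing R or C rescales τ but leaves every line of this table unchanged, which is why it is convenient to measure time in units of τ.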
Ok. What happens if we wait a long time, the capacitor is essentially fully charged now, and
we open switch S again? Well, all the charge we put on the capacitor is going to come right back
out. Just before we open the switch, the voltage on the capacitor is Q/C. Once we open the
switch, the charge flows back out of the capacitor into the resistor. Charges first leave the bottom
plate in Figure 6.6a and enter the resistor, which lets some charges move from the top plate to the
bottom plate of the capacitor. Lather, rinse, repeat, and after some time the capacitor is completely
discharged.
If the discharge begins at t = 0, the charge on the capacitor varies as:
$$ q(t) = Q e^{-t/RC} = Q e^{-t/\tau} \tag{6.61} $$
Again the time scale is in units of RC - after one time constant τ , we have now lost 63.2% of
the charge, so q = 0.368Q. We can write the voltage on the capacitor down too:
$$ \Delta V_C = \frac{Q}{C} e^{-t/\tau} = \Delta V e^{-t/\tau} \tag{6.62} $$
Now we can better explain Figure 6.6b. In the experimental setup used (identical to your lab
hardware), R = 1500 Ω, C = 2200 µF, and ∆V = 2.0 V, so τ = RC = 3.3 s. At t = 0 in the graph, the
switch is closed, and the capacitor begins to charge. At t = 20 sec, about 6 time constants later,
the capacitor is about 99.8% charged and the switch is opened again. The capacitor discharges, and
another 6τ later it is nearly fully discharged.
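The numbers for the lab setup can be reproduced from the quoted values of R, C, and ∆V. The following Python sketch is illustrative only: the function names are assumptions, and the discharge function starts from whatever voltage the capacitor has reached rather than from exactly ∆V, which is a slight generalization of Equation 6.62.

```python
import math

# Values quoted in the text for the lab setup.
r = 1500.0        # ohms
c = 2200e-6       # farads
delta_v = 2.0     # volts

tau = r * c       # time constant, in seconds
t_charge = 20.0   # duration of the charging interval, in seconds

def v_charging(t):
    """Capacitor voltage a time t after charging begins from zero (Eq. 6.59)."""
    return delta_v * (1.0 - math.exp(-t / tau))

def v_discharging(t, v_start):
    """Capacitor voltage a time t after discharge begins from v_start."""
    return v_start * math.exp(-t / tau)

v_end = v_charging(t_charge)
print(f"tau = {tau:.2f} s, so 20 s is {t_charge / tau:.2f} time constants")
print(f"charged to {100 * v_end / delta_v:.1f}% after 20 s")
print(f"{100 * v_discharging(6 * tau, v_end) / v_end:.2f}% remains 6 tau into the discharge")
```

The remaining fraction after 6τ of discharge is e^−6, a fraction of a percent, which is what “nearly fully discharged” means quantitatively.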
1. In order to maximize the percentage of the power that is delivered from a battery to a device,
the internal resistance of the battery should be:
As low as possible
As high as possible
The percentage does not depend on the internal resistance.
2. Two resistors connected in series are measured to have an equivalent resistance of 1000 Ω.
The same two resistors in parallel are measured to have an equivalent resistance of 250 Ω. What
are the values of the resistors?
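4. Refer to the figures at left. What happens to the reading on the ammeter when the switch S is
opened?
the reading goes up
the reading goes down
the reading does not change
[Figure: two copies of the same circuit, labeled “switch closed” and “switch open.” Each contains a battery ΔV, an ammeter A, resistors R1 and R2 , and a switch S.]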
5. A light bulb has a resistance of 230 Ω when operated at a voltage of 120 V. What is the
current in the bulb? Recall 1 mA = 10^−3 A.
1.92 mA
522 mA
245 mA
1.04 A
9. Kirchhoff’s rules result from two basic physical laws. What are they?
12. Two 1.60 V batteries - with their positive terminals in the same direction - are inserted in
series into the barrel of a flashlight. One battery has an internal resistance of 0.270 Ω, the other
has an internal resistance of 0.151 Ω. When the switch is closed, a current of 0.600 A passes
through the lamp. What is the lamp’s resistance?
2.25 Ω
3.73 Ω
4.91 Ω
6.80 Ω
13. A flashlight uses a 1.5 V battery with a negligible internal resistance to light a bulb rated
for a maximum power of 1 W. What is the maximum current through the bulb? Assume that
the battery has more than enough capacity to drive this current, i.e., it is ideal.
0.67 A
1.50 A
2.25 A
0.50 A
7Ω
9.9 V
8.2 V
0.9 V
4.5 V
16. Refer to the figure at right. Which circuit properly measures the current and voltage for
the resistor? You may assume that the voltmeters and ammeters are perfect, and the battery is ideal.
circuit (a)
circuit (b)
circuit (c)
circuit (d)
[Figure: four circuits, (a) through (d), each containing a voltmeter V and an ammeter A in a different arrangement.]
200 Ω · m
2.9 Ω · m
2.0 × 10^6 Ω · m
2.9 × 10^−4 Ω · m
6.8 Problems
1. A regular tetrahedron is a pyramid with a triangular base. Six 14.0 Ω resistors are placed
along its six edges, with junctions at its four vertices. A 9.0 V battery is connected to any two
of the vertices. (a) Find the equivalent resistance of the tetrahedron between these vertices.
(b) Find the current in the battery.
2. A group of students on spring break manages to reach a deserted island in their wrecked
sailboat. They splash ashore with fuel, a European gasoline-powered 240 V generator, a box of
North American 100 W, 120 V lightbulbs, a 500 W 120 V hot pot, lamp sockets, and some insulated
wire. While waiting to be rescued they decide to use the generator to operate some bulbs.
(a) Draw a diagram of a circuit they can use, containing the minimum number of lightbulbs
with 120 V across each bulb, and no higher output.
(b) One student catches a fish and wants to cook it in the hot pot. Draw a diagram of a circuit
containing the hot pot and the minimum number of lightbulbs with 120 V across each device,
and not more. Find the current in the generator and its power output.
3. You need a 45 Ω resistor, but the stockroom has only 20 Ω and 50 Ω resistors. How can the
desired resistance be achieved under these circumstances?
2. Both are 500 Ω. Call the two resistors R1 and R2 . Connected in series, their equivalent
resistance is R1 + R2 = 1000 Ω. Connected in parallel, their equivalent resistance is 250 Ω, so
that 1/R1 + 1/R2 = 1/(250 Ω).
\begin{align*}
R_1 + R_2 &= 1000 \\
\frac{1}{R_1} + \frac{1}{R_2} &= \frac{1}{250} \\
\frac{1}{R_1} + \frac{1}{1000 - R_1} &= \frac{1}{250} \\
\frac{1000 - R_1 + R_1}{R_1 (1000 - R_1)} = \frac{1000}{R_1 (1000 - R_1)} &= \frac{1}{250} \\
1000 R_1 - R_1^2 &= (250)(1000) \\
R_1^2 - 1000 R_1 + 250000 &= 0 \\
\implies R_1 &= 500\,\Omega = R_2
\end{align*}
So there is a bit of math, but it works out in the end. Alternatively, a simpler way is to just
look at the possible answers and try them out!
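The same algebra works for any measured pair of series and parallel resistances, and can be sketched in Python. Since R1 + R2 = Rseries and R1 R2 = Rseries Rparallel , the two resistances are the roots of a quadratic. The function name below is a hypothetical helper introduced only for this illustration.

```python
import math

def resistors_from_series_parallel(r_series, r_parallel):
    """Return (R1, R2) given their series and parallel equivalent resistances.

    Uses R1 + R2 = r_series and R1 * R2 = r_series * r_parallel, so R1 and R2
    are the roots of x**2 - r_series * x + r_series * r_parallel = 0.
    """
    disc = r_series**2 - 4.0 * r_series * r_parallel
    if disc < 0:
        raise ValueError("no real pair of resistors gives these values")
    root = math.sqrt(disc)
    return (r_series + root) / 2.0, (r_series - root) / 2.0

# Values from the question: 1000 Ohm in series, 250 Ohm in parallel.
print(resistors_from_series_parallel(1000.0, 250.0))
```

For the values in this question the discriminant vanishes, which is why the two resistors come out equal.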
4. The reading goes down. When the switch is closed, we have R2 in parallel with a switch.
Switches (ideally) have zero resistance, so all the current goes through the switch and none
goes through R2 – if we calculate the equivalent resistance of R2 in parallel with zero resistance,
the equivalent resistance is still zero. Thus, the battery is connected effectively only to R1 , and
there is a current of:
$$ I_{closed} = \frac{\Delta V}{R_1} $$
When the switch is opened, resistors R1 and R2 are now in series, so that the total circuit
resistance is larger than when the switch was closed. As a result, the current decreases, since
the applied voltage is the same in both cases. The total current is now:
$$ I_{open} = \frac{\Delta V}{R_1 + R_2} < \frac{\Delta V}{R_1} = I_{closed} $$
No matter what R1 and R2 are, since resistances are always positive, the current has to be
smaller when the switch is open.
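5. 522 mA. We know the resistance R = 230 Ω, and the voltage V = 120 Volts. We can get the
current from Ohm’s law:
\begin{align*}
R &= \frac{V}{I} \implies I = \frac{V}{R} \\
I &= \frac{120\,\text{Volts}}{230\,\Omega} = 0.522\,\frac{\text{Volts}}{\Omega} = 0.522\,\text{Amps} = 522\,\text{mA}
\end{align*}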
6. 1.5 Ω. We have 135 resistors in parallel R1 through R135 , all of the same value. We know
that the equivalent resistance must be:
$$ \frac{1}{R_{eq}} = \frac{1}{R_1} + \frac{1}{R_2} + \ldots + \frac{1}{R_{135}} = 135 \, \frac{1}{R_1} = \frac{135}{200} \tag{6.63} $$
So Req = 200/135 ≈ 1.5 Ω.
7. 17.3 Ω. First, note that you can combine the middle two resistors (7 Ω and 11 Ω) which are
just in a simple parallel combination. The equivalent resistance for these two is:
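$$ \frac{1}{R_{eq,\,7\text{-}11}} = \frac{1}{7} + \frac{1}{11} = 0.234 \implies R_{eq,\,7\text{-}11} = 4.28\,\Omega $$
Now we have three resistors in series - 4 Ω, 4.28 Ω, and 9 Ω. Resistors in series just add together,
so the total equivalent resistance is:
$$ R_{eq} = 4\,\Omega + 4.28\,\Omega + 9\,\Omega = 17.28\,\Omega \approx 17.3\,\Omega $$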
8. 54 Ω. Note that the 17 Ω resistor is only connected on one end, so it doesn’t do anything!
First, combine the 10 and 15 Ω resistors in series to make 25 Ω. This 25 Ω effective resistor is
then in parallel with the 50 Ω resistor. Combining those two makes (approximately) 17 Ω, which
is now purely in series with the 37 Ω resistor. Adding those two together gives you, to two
significant figures, 54 Ω.
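The arithmetic in this reduction can be checked with a few lines of Python, using the resistor values quoted in the solution. The helper names series and parallel are assumptions introduced for this sketch.

```python
def series(*rs):
    """Equivalent resistance of resistors in series."""
    return sum(rs)

def parallel(*rs):
    """Equivalent resistance of resistors in parallel."""
    return 1.0 / sum(1.0 / r for r in rs)

# Values from the solution text, in ohms. The 17 Ohm resistor is omitted
# because it is connected at only one end and carries no current.
r_branch = series(10.0, 15.0)         # 25 Ohm
r_combo = parallel(r_branch, 50.0)    # roughly 17 Ohm
r_eq = series(r_combo, 37.0)          # roughly 54 Ohm
print(r_branch, r_combo, r_eq)
```

Keeping the unrounded intermediate value for the parallel pair and rounding only at the end gives the same two-significant-figure answer as the solution.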
Conservation of momentum played no direct role in the two rules stated. It did help us derive
Ohm’s law in a simple way, but it does not lead us to the rules above. Coulomb’s law does
not directly lead us to rule (1) or (2) – it deals with electric force, whereas rule (1) deals with
electric potential. At the very least, we need Coulomb’s law plus a bit of calculus to get rule (1),
and it will not get us rule number (2). Finally, charge quantization does not imply conservation
of charge. Charge quantization just says that charge comes in discrete units of e, it does not
tell us that charges cannot be created or destroyed.
10. 2 A. After a long enough time, the capacitor will be completely charged. A current only
flows in a capacitor while it is charging or discharging. Even during charging and discharging,
the current steadily decreases with time until the capacitor is completely full or empty, respec-
tively. Since the problem says “steady-state”, we may assume that the capacitor is no longer
charging – if it were, the current would not be steady, but decreasing, and after a long enough
time, the capacitor should be fully charged anyway.
If the capacitor is fully charged and no current flows through it, then there is also no current
through the 1 Ω resistor in series with it. If there is no current through the resistor either, then
there is no voltage drop across it, and that whole branch of the circuit actually does nothing.
Remember, if no current flows through a path in a circuit, it isn’t doing anything except possi-
bly storing energy. Portions of a circuit with no current can almost always be neglected when
analyzing the rest of the circuit.
If the 1 mF-1 Ω branch of the circuit can be neglected, then the only things left are a single 6 V
battery, a 1 Ω resistor, and a 2 Ω resistor, all in series. Finding the current now is a simple matter,
since the 1 Ω and 2 Ω resistors in series just make an equivalent resistance of 3 Ω. Effectively,
we have a single battery and resistor, for which we can easily calculate the current:
$$ I = \frac{\Delta V}{R_{eq}} = \frac{6\,\text{V}}{3\,\Omega} = 2\,\text{A} $$
11. 5=6, 1=2=3=4. We only need to remember three things to figure this one out: (1) when
a current encounters a junction, it splits up to take each path in amounts inversely proportional
to the resistance of the path, (2) the current through a single loop of a circuit is the same
everywhere, and (3) related to the last point, charge must be conserved, such that the same
number of charges entering a wire have to leave it.
First, think about a current leaving the battery at point 5 and traveling clockwise around the cir-
cuit. The current reaches the junction leading to points 1 and 3, and must split up to take both
paths. Since both paths have the same resistance (the resistors are equivalent, remember), the
current will split up equally between the two. Therefore, the current is the same at points 1 and 3.
The current in the path from 1-2 or 3-4 is in just a single wire, and the current can’t change.
Conservation of charge requires that every charge entering point 1 leaves through point 2 (and
the same for points 3 and 4). Therefore, the currents at points 1 and 2 are equal, and so are
those at points 3 and 4. Putting everything so far together, the current is the same at 1, 2, 3,
and 4.
What about the currents at points 5 and 6? Conservation of charge again requires that the
charges leaving the battery at 5 must eventually come back through point 6 – no charge can
be gained or lost when going around the loop. Therefore, the currents at points 5 and 6 must
be the same. Further, since the whole current leaving the battery at point 5 splits up into two
separate (and equal) currents at points 1 and 3, the current at point 5 must be larger than the
current at points 1 and 3. Therefore, overall the ranking from highest to lowest must be 5=6,
1=2=3=4.
12. 4.91 Ω. Each battery has an internal resistance that acts in series. Once the bulb is
connected, it is also in series with both batteries. All we have is two batteries and three resistors
in series, nothing more. The sum of the voltage sources (the batteries) has to equal the sum
of the voltage drops (current through the resistors) around the whole circuit - conservation of
energy again. We know the current, the value of two of the resistors, and the voltages on the
batteries. Let our unknown lamp resistance be r:
13. 0.67 A. Basically, all we need to remember is the relationship between power P, current I,
and voltage ∆V :
$$P = I\,\Delta V$$
$$1\,\text{W} = I\,(1.5\,\text{V})$$
$$\Longrightarrow\quad I = \frac{1\,\text{W}}{1.5\,\text{V}} \approx 0.67\,\text{A}$$
14. 347 mA. In a preceding problem, we found the equivalent resistance of this circuit to be
17.3 Ω. This single effective resistor is connected to a 6 V battery, so the current in the effective
resistor has to be:
$$I_{eq} = \frac{\Delta V}{R_{eq}} = \frac{6\,\text{V}}{17.3\,\Omega} \approx 0.347\,\text{A} = 347\,\text{mA} \tag{6.69}$$
Now, think about the circuit topology. The current through the equivalent resistor is the same as
that through the 9 Ω resistor! If we work backwards from finding the overall equivalent resistor,
the equivalent resistor decomposes into a composite of the 4, 7, and 11 Ω resistors and the 9 Ω
resistor in series. Since two series resistors must both have the same current, they both have
the same current as their equivalent resistance as well, and the current in the 9 Ω resistor must
be 347 mA.
15. If we treat the battery as a perfect voltage source in series with its internal resistance, then
the whole circuit under consideration is a perfect source of 9 V, a 1 Ω resistor, and a 10 Ω resistor
all in series. The fact that they are all in series means they all have the same current. The
internal resistance and the 10 Ω load resistance in series are equivalent to a single 11 Ω resistor,
which means that effectively a perfect 9 V battery is connected to a single 11 Ω resistor. In that
case, we can find the voltage across the 10 Ω resistor by first finding the current in the single
loop of the circuit:
$$I = \frac{\Delta V}{R_{eq}} = \frac{9\,\text{V}}{11\,\Omega} \approx 0.818\,\text{A}$$
The voltage across the 10 Ω resistor is then just given by Ohm’s law:
$$\Delta V_{10\,\Omega} = IR = (0.818\,\text{A})(10\,\Omega) \approx 8.18\,\text{V}$$
16. (a) is the only correct schematic. Remember: voltmeters have enormous internal
resistances, and must be in parallel with what they are measuring. Ammeters have tiny internal
resistances, and must be in series with what they are measuring. Based on this alone, (a) is the
only correct schematic.
Circuit (b) is wrong because the ammeter is connected in parallel with the resistor. The am-
meter’s resistance is sufficiently low (zero, ideally) that it will ‘steal’ all of the current from the
resistor instead of measuring it. The same effect could be had by just connecting a short-cut
wire across the resistor – the ammeter effectively takes it out of the circuit by providing a far
lower resistance path, such that little current will actually go through the resistor. The fact that
a low equivalent resistance is connected to the battery means a large current will flow, quickly
draining the battery. The voltmeter is connected correctly, but in this case it will basically only
measure the voltage drop across the ammeter itself.
Circuit (c) is wrong because the ammeter is in parallel and the voltmeter is in series. The
enormous resistance of the voltmeter (infinite, ideally) means that almost all of the battery’s
voltage will be dropped across the voltmeter itself, and almost none will be left for the ammeter
and resistor. Since the ammeter effectively short-circuits the resistor anyway, this circuit will
measure neither I nor ∆V correctly.
Circuit (d) is wrong because again the voltmeter is in series. The ammeter is correct, but the
high resistance of the voltmeter will prevent all but the most minuscule currents from flowing
anyway, so there will be nothing to measure!
17. $2.9\times10^{-4}\ \Omega\cdot\text{m}$. We first need to know the relation between resistivity and resistance,
which includes the cross-sectional area of the wire A and its length l:
$$R = \frac{\varrho\, l}{A} \qquad\text{or}\qquad \varrho = \frac{RA}{l}$$
And then we add in the relation between current, voltage, and resistance, viz. R = ∆V /I.
$$\varrho = \frac{RA}{l} = \frac{\frac{\Delta V}{I}\,A}{l} = \frac{\Delta V\cdot A}{I\cdot l}$$
The wire is said to have a uniform radius, which can only be true if its cross section is circular.
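The area of the circular cross section is then just $A = \pi r^2$. Making sure we keep track of the
units, we just plug everything in and run the numbers:
$$\varrho = \frac{\Delta V\cdot A}{I\cdot l} = \frac{11\,\text{V}\cdot\pi\left(3.8\times10^{-3}\,\text{m}\right)^2}{0.45\,\text{A}\cdot 3.8\,\text{m}} = 2.9\times10^{-4}\,\frac{\text{V}\cdot\text{m}}{\text{A}} = 2.9\times10^{-4}\,\Omega\cdot\text{m}$$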
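As a quick numerical check of this arithmetic, here is a minimal Python sketch. The function name `resistivity` and its argument names are illustrative choices, not part of the original solution; the input values are the ones quoted in the equation above.

```python
import math

def resistivity(delta_v, current, radius, length):
    """Resistivity (ohm*m) of a uniform round wire from measured voltage, current and geometry."""
    area = math.pi * radius**2          # circular cross section, A = pi r^2
    resistance = delta_v / current      # R = delta V / I
    return resistance * area / length   # rho = R A / l

# Values from the problem: 11 V, 0.45 A, radius 3.8 mm, length 3.8 m
rho = resistivity(delta_v=11.0, current=0.45, radius=3.8e-3, length=3.8)
print(f"rho = {rho:.2e} ohm*m")
```

Keeping every quantity in SI units (meters, not millimeters) is what makes the result come out directly in Ω·m.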
1. COMING SOON.
2. COMING SOON.
3. Put two 50 Ω resistors in parallel, and connect that combination in series with a
20 Ω resistor. There are many other ways, this is perhaps the simplest.
MAGNETISM is a crucially important area of applied physics, more so than you may be
aware. Everything from motors to loudspeakers to Magnetic Resonance Imaging (MRI)
relies on magnets and magnetic fields. Though magnetism may seem like a phenomenon completely
distinct from electricity (and often less intuitive), in fact they are both different aspects of the
unified force of “electromagnetism.” Using what we have learned from special relativity, we will be
able to show that electric and magnetic fields are really the same thing.
The electric fields and potentials we studied in Chapters 3 and 4 resulted
from static distributions of electric charges in space. Magnetic fields, on
the other hand, come from moving charges - the electric currents we studied
in Chapters 5 and 6.
Magnetic fields affect moving charges, and conversely, moving charges
produce their own magnetic fields. Another aspect of this symmetry between
electric and magnetic fields is that time-varying magnetic fields induce
electric fields, and vice versa. Electric and magnetic fields are fundamentally
linked in their behavior in the time domain - the static aspect
of one field is no more than the dynamic manifestation of the other.

Figure 7.1: Hans Christian Øersted (1777 – 1851), a Danish physicist and chemist who discovered the link between electricity and magnetism. 26

However, it was not until 1820 that a formal link was established
between the sciences of Electrostatics and Current Electricity and magnetism.
In that year Øersted (Fig. 7.1) discovered that a magnetic compass
needle was deflected by an electric current - in other words, electric currents
produce magnetic fields. Within a few short months, Ampère (Fig. 6.1) had developed a theory
integrating electricity and magnetism. This theory is symbolized by the notion of equivalence of a
magnetic dipole (e.g., a bar magnet or a solenoid) and an electric dipole.
While studying electric fields and forces, we described the interactions between charged objects
in terms of electric fields. We said that an electric field surrounds any electric charge (or charge
distribution), and that the presence of an external electric field causes electric charges (or charge
distributions) to accelerate. When charged objects are stationary, knowledge of external electric
fields and the object’s own electric field is sufficient to describe the static interactions between
them.
The situation is different when charges are moving relative to one another or an external observer.
Our first experience with moving charges was in the form of an electric current, the net flow of
charges through some region in space. We discussed the relation between current and electric
potential, but curiously neglected to discuss the electric fields around moving charges. Indeed,
the interaction between moving charges is qualitatively different in many respects, and for this
reason the magnetic field is introduced. Moving charges are said to give rise to magnetic fields,
which are treated separately from (but on equal footing with) electric fields. What we should not
lose sight of is that, in fact, electric and magnetic fields represent the same fundamental force of
electromagnetism, merely in different guises.
In this picture, in addition to containing an electric field, the region of space surrounding any
moving electric charge also contains a magnetic field. A magnetic field also surrounds a magnetic
substance making up a permanent magnet. This is because permanent magnets can be viewed in
some sense as being made up microscopically of tiny current loops.i
Historically, the symbol B has been used to represent a magnetic field. The direction of the magnetic
field B at any location is the direction in which a compass needle would point at that location - the
magnetic field is a vector $\vec{B}$, just as the electric field $\vec{E}$ is.
As with the electric field, we can represent the magnetic field by means of drawings with magnetic field
lines. Figure 7.2 shows how the magnetic field lines of a bar magnet behave. Magnetic field lines point
away from north poles, and toward south poles, as electric field lines point away from positive charges
and toward negative charges. The main difference between the magnetic and electric aspects of
the electromagnetic force is that there are no isolated magnetic charges; magnets always come in
north-south pole combinations. You can verify this by breaking a magnet in half - this does not
separate the poles, but produces two magnets with two poles each.

Figure 7.2: (a) Field lines from a bar magnet, as visualized by spreading iron filings around the magnet. 27 (b) Schematic illustrating the magnetic field lines from a bar magnet.
Figure 7.2a displays the field lines around an ordinary bar magnet, as visualized by spreading
iron filings around the magnet, while Figure 7.2b shows a schematic illustration of the field lines and
direction. Figure 7.3 shows the field lines for two bar magnets brought close together, connected
‘north-north’ and ‘north-south’. In the region between two opposite poles, Fig. 7.3a, the field
lines are straight lines, representing a constant magnetic field of uniform direction. This is what
happens when you break a single bar magnet in half, and move the pieces apart. Between like
poles, Fig. 7.3b, the magnetic field vanishes where the fields from each pole cancel, and the field
lines repel each other.
Associated with the presence of a magnetic field is a certain amount of potential energy, as with
an electric field. Though we will not go into detail here, the energy tied up per unit volume goes
as the square of the magnetic field-line density.
i A modern quantum physics view of the problem recognizes that electrons themselves have tiny magnetic moments,
called spin, which are the cause of most magnetism we are familiar with. This does not affect our discussion, however.
Figure 7.3: (a) Magnetic field pattern surrounding two bar magnets aligned N-S. Note that the field is reinforced in the region
between the two magnets. (b) Magnetic field pattern surrounding two bar magnets aligned N-N. Note that the field is weaker
between the two magnets, and cancels along a vertical line equidistant between them.
Figure 7.4 illustrates the direction of the magnetic force, magnetic field, and velocity
vectors.
$$\vec{F}_B = q\,\vec{v}\times\vec{B} \qquad\text{or}\qquad |\vec{F}_B| = q\,|\vec{v}|\,|\vec{B}|\sin\theta_{vB} \tag{7.1}$$
The SI unit of magnetic field strength is the tesla (T), whereas the SI unit of magnetic flux
(magnetic field lines flowing through some area, like electric flux) is the weber (Wb). 1 weber = 1
tesla flowing through 1 square meter, and is a very large amount of magnetic flux. If the magnetic
force is in newtons, velocity in meters per second, and magnetic field in tesla, we can see from
Equation 7.1 that a charge of 1 C moving perpendicularly to a magnetic field of 1 T with a speed
of 1 m/s experiences a force of 1 N. Of course, we can also see that for a stationary particle or any
uncharged particle, there is no force, and there is also no force when $\vec{v}$ is parallel to $\vec{B}$. Since the
magnetic force is always perpendicular to the velocity, it does no work and never changes the kinetic energy of the charge it
acts on. However, since the magnetic force depends on the velocity of the charge and not only on its position, it cannot
strictly be classified as a conservative force either.
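The statements above about Equation 7.1 can be checked numerically. The following is a small pure-Python sketch; the helper names `cross`, `dot` and `magnetic_force` are illustrative, and the oblique test vectors are arbitrary values chosen for this example.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as (x, y, z) tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def magnetic_force(q, v, B):
    """F = q v x B, with q in coulombs, v in m/s and B in tesla; result in newtons."""
    return tuple(q * c for c in cross(v, B))

# 1 C moving at 1 m/s along x, perpendicular to a 1 T field along y: |F| = 1 N
F_perp = magnetic_force(1.0, (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))

# Velocity parallel to the field: no force at all
F_par = magnetic_force(1.0, (0.0, 2.0, 0.0), (0.0, 1.0, 0.0))

# Arbitrary oblique case: the force is still perpendicular to v, so F . v = 0
v_obl = (3.0, 4.0, 0.0)
F_obl = magnetic_force(2.0, v_obl, (0.0, 1.0, 0.5))

print("perpendicular:", F_perp)
print("parallel:     ", F_par)
print("F . v (oblique):", dot(F_obl, v_obl))
```

Because the force is always perpendicular to the velocity, the rate of work $\vec{F}\cdot\vec{v}$ vanishes, which is the numerical counterpart of the statement that the magnetic force does not change the particle's energy.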
Figure 7.4: (a) The direction of the magnetic force on a positively charged particle moving with
a velocity $\vec{v}$ in the presence of a magnetic field. When $\vec{v}$ is at an angle θ with respect to $\vec{B}$, the
magnetic force is perpendicular to both $\vec{v}$ and $\vec{B}$. (b) Oppositely directed magnetic forces $\vec{F}_B$ are
exerted on two oppositely charged particles moving at the same velocity in a magnetic field. The dashed
lines show the paths of the particles.
Figure 7.4 illustrates the vector relationship between the force $\vec{F}_B$, the velocity of a positively
charged particle $\vec{v}$, and the magnetic field $\vec{B}$. The force $\vec{F}_B$ will act in opposite directions
on positively and negatively charged particles, as the electric force does, and just as we expect from
Eq. 7.1 since $\vec{F}_B$ is proportional to q. This is important to keep in mind, particularly since electric
currents are almost invariably made up of moving negatively charged electrons, while mass spec-
trometers (Sect. 7.3.1.1) often involve the motion of positively charged ions. In other words, you
will have to deal with both positive and negative cases, so be careful about signs and directions!
contraction. When we view the spacing $l^O$ of the positive charges in the lab frame O, we are viewing
the contracted length. In the test charge’s frame O′, we must un-contract the spacing $l^O$ into the O′
frame to figure out what the test charge really sees. If we call the spacing of the positive charges
that the moving test charge experiences in its frame O′ $l_+^{O'}$, we can easily relate it to the spacing $l^O$:
$$l_+^{O'} = l^O\,\gamma \tag{7.2}$$
$$l_+^{O'} = \frac{l^O}{\sqrt{1 - \frac{v^2}{c^2}}} \tag{7.3}$$
Since we know γ ≥ 1, it is clear that the spacing the test charge sees is larger than what we
see in the lab frame. Meanwhile, what about the negative charges, which are stationary in the lab
frame? The test charge sees from its frame the negative charges moving to the left with velocity
$\vec{v}$, so their spacing must be contracted to figure out the spacing of the negative charges $l_-^{O'}$ the test
charge sees:
$$\gamma\, l_-^{O'} = l^O \tag{7.4}$$
$$l_-^{O'} = \frac{l^O}{\gamma} \tag{7.5}$$
$$l_-^{O'} = l^O\sqrt{1 - \frac{v^2}{c^2}} \tag{7.6}$$
Again, since γ ≥ 1, the positive test charge sees a reduced spacing of the negative charges. Since
the positive and negative charges now no longer appear to have the same spacing when viewed from
the test charge’s frame, the test charge sees a net negative charge density, since there are effectively
more negative charges per unit length than positive charges. The presence of a net negative charge
density from the test charge’s point of view means that it experiences a net attractive force from
the wire. From the lab frame, we would not expect any force between the test charge and the wire,
but sure enough, a proper relativistic treatment leads us to deduce that a force must in fact be
present.
How big is the force? First, we need to figure out the charge density in the wire that the test
charge sees. Since we don’t want to restrict ourselves to any particular length of wire, we will
calculate the charge per unit length as viewed in the test charge’s frame, $\lambda^{O'}$. How
do we find this? We know that all charges in the wire have charge q, and we know their average
spacing. Dividing q by the average spacing for each kind of charge will give us the
charge per unit length for both positive and negative charges, and subtracting those two will give
us the net charge density:
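\begin{align}
\lambda^{O'} &= \lambda_+^{O'} - \lambda_-^{O'} \tag{7.7}\\
&= \frac{q}{l_+^{O'}} - \frac{q}{l_-^{O'}} \tag{7.8}\\
&= \frac{q}{l^O}\sqrt{1 - \frac{v^2}{c^2}} - \frac{q}{l^O}\,\frac{1}{\sqrt{1 - \frac{v^2}{c^2}}} \tag{7.9}\\
&= \frac{q}{l^O}\left[\sqrt{1 - \frac{v^2}{c^2}} - \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}\right] \tag{7.10}
\end{align}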
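This is a bit messy. However, we know that the drift velocity of charges in a conductor is
very small compared to c (of order $10^{-3}$ m/s, see Sect. 5.4.1). When $v \ll c$, we can use the following
approximations (footnote iii):
$$\gamma = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}} \approx 1 + \frac{1}{2}\frac{v^2}{c^2} \qquad (v \ll c) \tag{7.11}$$
$$\frac{1}{\gamma} = \sqrt{1 - \frac{v^2}{c^2}} \approx 1 - \frac{1}{2}\frac{v^2}{c^2} \qquad (v \ll c) \tag{7.12}$$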
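Using these approximations in Eq. 7.10, we can come up with a simple expression for $\lambda^{O'}$:
\begin{align}
\lambda^{O'} &= \frac{q}{l^O}\left[\sqrt{1 - \frac{v^2}{c^2}} - \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}\right] \tag{7.13}\\
&= \frac{q}{l^O}\left[\left(1 - \frac{1}{2}\frac{v^2}{c^2}\right) - \left(1 + \frac{1}{2}\frac{v^2}{c^2}\right)\right] \tag{7.14}\\
&= -\frac{q}{l^O}\,\frac{v^2}{c^2} \tag{7.15}
\end{align}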
Now that we have the charge density of the wire as viewed from the test charge’s frame, what
is the electrostatic force? The problem is now to find the electric field at a distance R from a long,
uniformly charged wire of charge density $\lambda^{O'}$, which we already did in Sec. 3.8.6. Using Eq. 3.28, we
can immediately write down the electrostatic force experienced by the test charge in its reference
frame:
$$|\vec{F}| = Q|\vec{E}| = Q\cdot\frac{2k_e|\lambda^{O'}|}{R} = \frac{2k_e Q q v^2}{R\,l\,c^2} \tag{7.16}$$
iii These approximations come from a Taylor series expansion. Don’t worry if you don’t know how to derive them.
We can simplify this a bit. The current in the wire is the charge q divided by the time it takes the
charges to move one spacing l (the lab-frame spacing $l^O$), which is ∆t = l/v (footnote iv). Thus the current can be written as I = qv/l:
$$|\vec{F}| = Qv\left(\frac{2k_e I}{c^2 R}\right) \tag{7.17}$$
If we associate the quantity in parentheses with an effective magnetic field, then we have derived
Eq. 7.1 (footnote v):
$$|\vec{F}| = Qv|\vec{B}| \tag{7.18}$$
$$\text{with}\quad |\vec{B}| = \frac{2k_e I}{c^2 R} \tag{7.19}$$
This is it. A test charge moving near a current-carrying wire experiences a net force proportional
to its charge, velocity, and the current in the wire. We have managed to derive the existence of the
magnetic field and magnetic force from nothing more than Coulomb’s lawvi and special relativity – a
magnetic field is nothing more than the field of moving charges. Further, by analogy with
Eq. 7.1, we have established that there is a magnetic field surrounding a long, straight wire. This is
perhaps the most important result – electric currents create magnetic fields. Electricity and
magnetism really are the same thing viewed from different reference frames. Amazing, isn’t it?
In some sense, it is remarkable that we can measure magnetic forces due to currents at all. The
drift velocity is minuscule compared to c, $v/c \sim 10^{-12}$ or so, and γ is barely different from 1, about
$1 + 10^{-24}$. The magnetic force results from a tiny relativistic correction, certainly, but it is indeed
a significant effect in the end because there are truly astronomical numbers of charges per unit
length inside conductors. Even though the force per charge is minuscule, they make up for it in
numbers. Before moving on, we note that if you repeat this analysis for the more complicated case
that the test charge’s velocity is not the same as the charges in the wire, and not parallel, you still
arrive at exactly Eq. 7.1. It just takes quite a bit longer . . .
it must be radially symmetric about the wire axis as well – in other words, the magnetic field must
be constant in magnitude on circles drawn around the wire.
The force itself is directed perpendicular to and toward the wire, and perpendicular to the test
charge’s velocity. If the force is proportional to and perpendicular to the velocity, and proportional
to the magnitude of the magnetic field, the force can only result from a vector product (or “cross
product”) between the velocity and magnetic field, $\vec{F} = Q\,\vec{v}\times\vec{B}$. A cross product between $\vec{v}$ and
$\vec{B}$ fulfills all the requirements – if the force and velocity are perpendicular, the magnetic
field must be perpendicular to both. For this to be true and still have a radially symmetric field as
required by symmetry, there is only one possibility: the magnetic field circulates around the wire!
This is shown schematically in Figure 7.6.
Now we have a simple mathematical form of the magnetic field surrounding a current-carrying
wire:
Magnetic field around a long, straight wire:
$$\vec{B} = \frac{2k_e I}{c^2 R}\,\hat{\theta} = \frac{\mu_0 I}{2\pi r}\,\hat{\theta} \tag{7.20}$$
where I is the current in the wire, r is the distance from the axis of the wire (written R in the derivation above), and $\hat{\theta}$ is
the angular unit vector around the wire axis.
Just like electric field and electric potential near a point charge, B diverges when you get
infinitesimally close to the wire. The magnetic field doesn’t really become infinite, that just means
that when we get too close, we are actually inside the wire, and different physics must be used.
The µ0 in Equation 7.20 is a new constant, the “permeability of free space,” and has the value
$$\mu_0 \equiv 4\pi\times10^{-7}\ \text{T}\cdot\text{m/A} \tag{7.21}$$
The constants $\mu_0$, $\epsilon_0$, and the speed of light c are intimately related, so you really only have to
remember two of the three:
$$\mu_0 = \frac{1}{\epsilon_0 c^2} \tag{7.22}$$
This is the reason that µ0 is defined by Equation 7.21 using “≡” instead of “=” – the interdependence
of these three constants led physicists to just define µ0 as fixed, since c and $\epsilon_0$ determine
it uniquely anyway. If you substitute $k_e = 1/4\pi\epsilon_0$ and $\epsilon_0 c^2 = 1/\mu_0$ into Eq. 7.20 above, you can see
that both forms are correct:
$$|\vec{B}| = \frac{2k_e I}{c^2 R} = \frac{2I}{4\pi\epsilon_0 c^2 R} = \frac{I}{2\pi\epsilon_0 c^2 R} = \frac{\mu_0 I}{2\pi R} \tag{7.23}$$
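The equivalence of the two forms can also be confirmed numerically. The sketch below is only a consistency check of the algebra: it assumes the conventional values c = 299 792 458 m/s and µ0 = 4π × 10⁻⁷ T·m/A, and derives ε0 and k_e from them via Eq. 7.22, so the agreement is guaranteed by construction. The function names and the sample current and distance (1 A at 1 cm) are illustrative assumptions.

```python
import math

C = 299_792_458.0              # speed of light, m/s
MU_0 = 4.0e-7 * math.pi        # permeability of free space, T*m/A (conventional value, assumed)
EPS_0 = 1.0 / (MU_0 * C**2)    # permittivity of free space, from Eq. 7.22
K_E = 1.0 / (4.0 * math.pi * EPS_0)   # Coulomb constant

def b_wire_relativistic(current, distance):
    """|B| = 2 k_e I / (c^2 R), the form obtained from Coulomb's law and relativity."""
    return 2.0 * K_E * current / (C**2 * distance)

def b_wire_mu0(current, distance):
    """|B| = mu_0 I / (2 pi R), the conventional form."""
    return MU_0 * current / (2.0 * math.pi * distance)

current_a = 1.0     # amperes (example value)
distance_m = 0.01   # meters (example value)
print(f"2 k_e I / (c^2 R) = {b_wire_relativistic(current_a, distance_m):.6e} T")
print(f"mu_0 I / (2 pi R) = {b_wire_mu0(current_a, distance_m):.6e} T")
```

For these example values the field is about 20 µT, a little smaller than the 55 µT quoted for the earth's field in the example later in this section.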
Now there is just one more nagging point about the field surrounding the wire. Which direction
does it circulate, clockwise, or counterclockwise?
7.1.4 Handedness
What we do not fully know yet is the proper sense of circulation of the magnetic field surrounding
a current-carrying wire. In order to determine that, we need to think a bit deeper about three-
dimensional geometry and “handedness.”
The fact that $\vec{B}$, $\vec{v}$, and $\vec{F}_B$ are mutually perpendicular implies a unique axis for each, since
in three dimensions there are only three mutually perpendicular axes. This fact alone does not
determine a unique direction for all three, however. We have two possible choices for the convention
of direction, corresponding to two senses of “handedness,” or two possible coordinate systems, as
shown in Fig. 7.7. You may recall the same problem when learning about torque and angular
momentum. Or, if you are a chemist, you know this problem as chirality. An object is said to
be ‘chiral’ if its mirror image cannot be superimposed on the original. No amount of rotation or
translation will make the mirror image look exactly like the original. Your hands are good examples
- no amount of rotation or manipulation will change a left hand into a right hand, hence the name.
This is clearly the case for the two coordinate systems in Fig. 7.7a and Fig. 7.7b, or the two
helixes in Fig. 7.7c.
The magnetic force is in some sense chiral. Looking back at Fig. 7.4, if we were to reverse the
direction of $\vec{v}$, then we would also have to reverse the direction of $\vec{B}$, but not $\vec{F}_B$. Similarly, we
could reverse both $\vec{v}$ and $\vec{F}_B$ and $\vec{B}$ would be left unchanged (footnote vii). The diagram of $\vec{B}$, $\vec{v}$, and $\vec{F}_B$ in
Fig. 7.4 is not equivalent to its mirror image, and is hence chiral. We will not dwell on this point
further, but suffice it to say, as a convention we always choose the right handed coordinate system.

vii This is a result of the fact that the magnetic field is technically a pseudovector, not a true vector. Pseudovectors
act just like real vectors, except they gain a sign flip under improper rotation. 28 An improper rotation is an inversion
followed by a normal (proper) rotation, just what we are doing when we switch between right- and left-handed
coordinate systems. A proper rotation has no inversion step, just rotation.
Figure 7.7: (a) A right-handed coordinate system (b) A left-handed coordinate system. Can you see that a and b
are not equivalent? Try rotating them in your head. (c) A right-handed and a left-handed helix. Normal DNA is
a right-handed helix, though left-handed DNA does exist. No definitive biological significance of “left-handed”-DNA
has yet been shown.
We can easily pick which is the right-handed coordinate system and choose the proper directions
of $\vec{B}$, $\vec{v}$, and $\vec{F}_B$, with a simple rule, the right-hand rule number 1:
Right-hand rule # 1:
1. Point the fingers of your right hand along the direction of the velocity.
2. Point your thumb in the direction of the magnetic field $\vec{B}$.
3. The magnetic force on a positive charge points out from the back of your hand.
-OR-
Both forms of the right-hand rule (should) give you the same result; use whichever is more
intuitive for you. Note that if you replace $\vec{v}$ with x, $\vec{B}$ with y, and $\vec{F}_B$ with z, the same rules let
you choose a right-handed coordinate system. For a current-carrying wire, we can come up with a
more specific rule, since the velocity of the charges making up the current is always along the axis
of the wire. This rule is unimaginatively called the second right-hand rule:
Question: Consider a proton moving with a speed of $1\times10^5$ m/s through the earth’s
magnetic field ($|\vec{B}| = 55\ \mu\text{T}$). When the proton moves east, the magnetic force acts
straight upward. When the proton moves northward, no force acts on it. What is the
direction and magnitude of the magnetic field?
Answer: First, the lack of a magnetic force when the proton moves north means that
the magnetic field must be pointing either north or south – there is zero force only
when velocity and magnetic field are parallel. The right-hand rule tells us that since the
net force is upward, and the velocity is eastward, the magnetic field must be pointing
north. (See Figure 7.4a.) The magnitude of the magnetic force is readily calculated from
Equation 7.1:
$$|\vec{F}| = q\,|\vec{v}|\,|\vec{B}|\sin\theta_{vB} = (1.6\times10^{-19}\,\text{C})(1\times10^5\,\text{m/s})(55\times10^{-6}\,\text{T})\sin 90^\circ = 8.8\times10^{-19}\,\text{N}$$
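The direction argument in this answer can be tested by brute force: try both candidate field directions and see which one gives an upward force on an eastward-moving proton. The sketch below assumes a coordinate system with x pointing east, y north and z up; the helper names are illustrative and the proton charge is rounded as in the worked answer.

```python
Q_PROTON = 1.6e-19   # C, rounded as in the worked answer
SPEED = 1.0e5        # m/s
B_MAG = 55e-6        # T

# Assumed axes for this sketch: x = east, y = north, z = up
EAST = (1.0, 0.0, 0.0)
NORTH = (0.0, 1.0, 0.0)
UP = (0.0, 0.0, 1.0)

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def scale(s, u):
    """Multiply a 3-vector by a scalar."""
    return tuple(s * c for c in u)

def force_on_proton(v_dir, b_dir):
    """F = q v x B for unit direction vectors v_dir and b_dir."""
    return scale(Q_PROTON, cross(scale(SPEED, v_dir), scale(B_MAG, b_dir)))

for label, b_dir in (("north", NORTH), ("south", scale(-1.0, NORTH))):
    f_east = force_on_proton(EAST, b_dir)     # proton moving east
    f_north = force_on_proton(NORTH, b_dir)   # proton moving north
    print(f"B pointing {label}: F(moving east) = {f_east}, F(moving north) = {f_north}")
```

Only the northward field gives a positive z-component (an upward force) for the eastward-moving proton, while both candidates give zero force for northward motion, in agreement with the reasoning above.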
Equation 7.20 lets us calculate the magnetic field due to a long, straight wire, but not much else.
Deriving everything from electrostatics and special relativity is certainly too tedious for common
usage. A more general technique is due to André-Marie Ampère, and it is much in the spirit of Gauss’
law (Sect. 3.8). “Ampère’s law” relates the current flowing through a surface to the magnetic
field tangential to the closed curve bounding that surface.
Figure 7.8: Ampère’s Law. Take any arbitrary closed path surrounding a current, break it up
into infinitesimal segments ∆l, and find the component of the magnetic field parallel to each segment,
$B_\parallel$. The sum of all such products $B_\parallel \Delta l$ around the closed path gives µ0 times the current
passing through the surface bounded by the path.
Take any arbitrary closed path surrounding a current, as in Figure 7.8, and break it up into
infinitesimal segments ∆l. Now find the component of the magnetic field parallel to the segment,
$B_\parallel$, and compute the product $B_\parallel \Delta l$. The sum of all such products around the closed path is µ0 times the
current passing through the surface bounded by the path:
Ampère’s law:
$$\sum_{\text{closed path}} B_\parallel\,\Delta l = \mu_0 I_{\text{enclosed}} \tag{7.24}$$
where $B_\parallel$ is the field component parallel to the segment ∆l, and $I_{\text{enclosed}}$ is the current
passing through the surface defined by the closed path.
Again, just like with Gauss’ law, we choose particularly convenient paths around a current, such
that everywhere on the path B is either perfectly parallel or perfectly perpendicular to the path. Unlike Gauss’
law, we have to be careful about the direction in which we trace out the path.
Take the long, straight wire carrying a current I. We know from symmetry that the magnetic
field must be radially symmetric about the wire, so we choose our Ampérian paths to be circles
centered on the wire, just like we chose spheres as our Gaussian surfaces surrounding point charges
(Sec. 3.8).
By symmetry, the magnetic field has the same value everywhere on the circle, and must be
tangential to the circle. That is, $B_\parallel = B$ for every segment ∆l on the circle. Computing the sum
is now easy, since $B_\parallel$ can just be taken out of the sum:
$$\sum B_\parallel\,\Delta l = B_\parallel \sum \Delta l = B_\parallel\, l_{\text{path}} = B_\parallel\cdot 2\pi r = \mu_0 I \tag{7.25}$$
$$\Longrightarrow\quad B_\parallel = \frac{\mu_0 I}{2\pi r} \tag{7.26}$$
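Ampère's law claims more than the circle calculation shows: the sum should equal µ0 I for any closed path around the wire, and zero for a path that encloses no current. The sketch below checks this numerically for off-center square paths, using the field of Eq. 7.20. Everything here is an illustrative assumption: the function names, the 2 A current, the path positions, the midpoint-rule discretization, and the choice of a counterclockwise path around a current along +z.

```python
import math

MU_0 = 4.0e-7 * math.pi   # T*m/A (conventional value, assumed)

def b_field(x, y, current):
    """Field (Bx, By) of a long straight wire along +z through the origin, Eq. 7.20."""
    r2 = x * x + y * y
    coeff = MU_0 * current / (2.0 * math.pi * r2)
    # theta-hat = (-y, x)/r, so B = [mu_0 I / (2 pi r)] * (-y, x)/r
    return (-coeff * y, coeff * x)

def circulation(path, current):
    """Sum of B_parallel * delta_l around a closed polygon given as a list of (x, y) vertices."""
    total = 0.0
    n = len(path)
    for i in range(n):
        x0, y0 = path[i]
        x1, y1 = path[(i + 1) % n]
        xm, ym = 0.5 * (x0 + x1), 0.5 * (y0 + y1)   # evaluate B at the segment midpoint
        bx, by = b_field(xm, ym, current)
        total += bx * (x1 - x0) + by * (y1 - y0)    # B . delta_l = B_parallel * delta_l
    return total

def square_path(center, half_side, points_per_side):
    """Vertices of a counterclockwise square, finely subdivided along each side."""
    cx, cy = center
    h = half_side
    corners = [(cx - h, cy - h), (cx + h, cy - h), (cx + h, cy + h), (cx - h, cy + h)]
    vertices = []
    for k in range(4):
        (x0, y0), (x1, y1) = corners[k], corners[(k + 1) % 4]
        for j in range(points_per_side):
            t = j / points_per_side
            vertices.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return vertices

CURRENT = 2.0  # amperes (example value)
enclosing = circulation(square_path((0.3, -0.2), 1.0, 2000), CURRENT)   # wire inside the square
outside = circulation(square_path((3.0, 0.0), 1.0, 2000), CURRENT)      # wire outside the square

print(f"enclosing path: sum / (mu_0 I) = {enclosing / (MU_0 * CURRENT):.6f}")
print(f"outside path:   sum / (mu_0 I) = {outside / (MU_0 * CURRENT):.6f}")
```

On a square the field is neither constant nor tangential, so nothing can be pulled out of the sum, yet the total should still come out close to µ0 I (up to discretization error). That is why the symmetric circle is a convenience rather than a requirement.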
This is exactly the result given by Equation 7.20, now derived from Ampère’s law and
symmetry alone! While Ampère’s law is very simple and elegant, it cannot easily be used for
complex current configurations which lack a nice symmetry like our wire has, and it is only valid
for static cases, when the E and B fields do not vary with time. There is, however, a slightly more
complex form which is valid in general. It relates not only current, but also the time variation of the
electric field, to $\sum B_\parallel \Delta l$.
the case of magnetic fields, the same law applies, but we know there are no unpaired “magnetic
charges” - magnets always come in north-south pairs. Therefore, any closed surface always encloses
pairs of magnetic poles, and there can be no net “magnetic charge” inside. Thus the net magnetic
flux $\Phi_B = BA\cos\theta_{BA}$, defined similarly to electric flux (Equation 3.5), out of any closed surface
bounding a volume is zero.
where θBA is the angle between the surface normal and the magnetic field.
Given any volume element, the net magnitude of the vector components of the magnetic field
that point outward from the surface must be equal to the net magnitude of the vector components
that point inward. This means that the magnetic field lines must be closed loops. Another way
of putting it is that magnetic field lines cannot originate from somewhere – following the lines
backwards or forward leads back to the starting position. Hence, this is a mathematical formulation
of the statement that there are no single magnetic poles. Magnetic poles always come in north-south
pairs, never alone.
By analogy, the net magnitude of the vector components of the electric field pointing outward
must be equal to the net magnitude of the vector components pointing inward plus the amount of
free charge inside. Electric field lines do originate from somewhere - from charges.
Gauss’ laws:
The electric flux $\Phi_E$ through any closed surface is equal to the net charge inside the
surface, $Q_{\text{inside}}$, divided by $\epsilon_0$:
$$\Phi_{E,\text{closed surface}} = \frac{Q_{\text{inside}}}{\epsilon_0} \tag{7.28}$$
The magnetic flux $\Phi_M$ through any closed surface bounding a volume is zero:
$$\Phi_{M,\text{closed surface}} = 0 \tag{7.29}$$
The fact that magnetic flux out of a closed surface is zero gives us Ampère’s law. If
there can be no net magnetic flux out of a closed region, then the tangential components of the
magnetic field around any closed curve we draw on the surface must sum to zero. If they did not,
then adding up all such curves to build up a closed volume would not lead to zero magnetic flux,
which would imply the existence of single magnetic poles. For even more detail about what this
means for the boundary conditions on the electric and magnetic fields, see Appendix B.
What about a version of Ampère’s law for electric fields? Surely Gauss’ law for electric fields
must imply something about the tangential components of the electric field around a closed loop.
Indeed it does, and it is this bit which explains how electric generators work. But not until next
chapter!
We have already found that a charged particle moving parallel to a current-carrying wire experiences
a force directed toward the wire. Now, we wish to consider the slightly more general case of a
single charged particle +q placed in a constant magnetic field, such that the particle’s velocity $\vec{v}$
is perpendicular to the magnetic field $\vec{B}$, Figure 7.9. We know that the magnetic force $\vec{F}_B$ will
always be perpendicular to $\vec{v}$ and perpendicular to the field $\vec{B}$. What is the resulting motion of
the particle?
Figure 7.9: When the velocity of a charge +q is perpendicular to a uniform magnetic field, the particle
moves in a circle whose plane is perpendicular to $\vec{B}$, which is into the page. The magnetic force $\vec{F}$
on the charge is always directed toward the center of the circle.
Take the case of the particle at the bottom of the circle in the figure, where the particle has a
velocity directed to the right. Applying the right-hand rule gives a force vertically upward. The
particle curves upward as a result, and then experiences a force to the left. And so on.
More generally we might ask: what is the locus of points such that the force, velocity, and
magnetic field are always perpendicular? A circle! The magnetic force is always directed toward
the center of a circular path, therefore the magnetic force causes a centripetal acceleration. As
we know, whenever a particle moves in a circular path, it experiences an effective centripetal force
$mv^2/r$, which must equal the sum of all other forces. Centripetal force changes the direction of $\vec{v}$,
but not its magnitude, so we can relate it to the magnetic force with Newton’s second law:
\[ F_B = F_{\text{centr.}} = qvB = \frac{mv^2}{r} \tag{7.30} \]
We can use this to find the radius of the path of a charged particle in a magnetic field:
\[ r = \frac{mv}{qB} \tag{7.31} \]
The radius of the particle’s path is proportional to its momentum mv, and inversely proportional
to its charge q and the magnitude of the magnetic field B. Equivalently, we can say the radius
depends on the mass-to-charge ratio of the particle, m/q.
If we know the radius of the particle’s path, then Equation 7.30 says that the velocity has to be
$v = qrB/m$. Since the particle is in uniform circular motion, we can define an angular frequency $\omega$,
the angle (in radians) that the particle sweeps out per unit time:
\[ \omega = \frac{v}{r} = \frac{qrB}{mr} = \frac{qB}{m} \tag{7.32} \]
where we have used Equations 7.30 and 7.31 in the last two steps. The period of the motion can
be found as well:
\[ T = \frac{2\pi r}{v} = \frac{2\pi}{\omega} = \frac{2\pi m}{qB} \tag{7.33} \]
In other words, the charged particle undergoes oscillatory motion, with a period proportional to
the mass to charge ratio m/q, and inversely proportional to the magnetic field B. This is roughly
the basis of one type of magnetic resonance, similar to MRI.
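As a quick numerical check of Equations 7.31 and 7.33, the short Python sketch below computes the orbit radius, angular frequency, and period. The function name `cyclotron_orbit` and the sample values (an electron moving at 3 × 10^6 m/s in a 10^−4 T field) are our own illustrative assumptions, not values taken from the text.

```python
import math

def cyclotron_orbit(mass, charge, speed, field):
    """Radius, angular frequency, and period for a charge moving
    perpendicular to a uniform magnetic field (all SI units)."""
    q = abs(charge)                      # only the magnitude sets the orbit size
    radius = mass * speed / (q * field)  # r = mv / (qB), Equation 7.31
    omega = speed / radius               # angular frequency, equal to qB / m
    period = 2 * math.pi / omega         # T = 2*pi / omega, Equation 7.33
    return radius, omega, period

# Illustrative (assumed) values: an electron in a weak laboratory field.
ELECTRON_MASS = 9.109e-31    # kg
ELECTRON_CHARGE = 1.602e-19  # C (magnitude)

r, omega, T = cyclotron_orbit(ELECTRON_MASS, ELECTRON_CHARGE, speed=3.0e6, field=1.0e-4)
print(f"r = {r:.3e} m, omega = {omega:.3e} rad/s, T = {T:.3e} s")
```

Doubling `speed` doubles the radius but leaves the period unchanged, which is the numerical face of the statement that the period depends only on m/q and B.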
What happens if the initial velocity is not perfectly perpendicular to the magnetic field? The
motion of the particle within the plane perpendicular to the field is still a circle, but we have to
add on a constant component in the direction parallel to the field. Circular motion in one plane,
and constant velocity along the direction perpendicular to that plane, gives a helix. Think about that for a minute, and
inspect Figure 7.10.
Figure 7.11 illustrates the basic operation of one type of mass spectrometer. Charged particles
enter a region at left where two parallel plates create a constant electric field E in the vertical
direction, while at the same time a constant magnetic field is applied into the page. The electric
field causes a force qE to be exerted on the particle upward, while the magnetic field exerts a force
qvB downward.
Figure 7.10: A charged particle with an initial velocity at an angle to the magnetic field moves in
a helical path.
If the net electric + magnetic force is zero, the particle has no acceleration, and travels on a
straight line path through a narrow aperture. For this to happen, we require:
\[ qE = qvB \quad\Rightarrow\quad v = \frac{E}{B} \tag{7.34} \]
This is a “velocity selector,” which creates a stream of particles of a specific velocity based on
the ratio E/B. Once the particles leave the aperture, they experience only a magnetic field, and
therefore only a force qvB directed initially downward. From the preceding section, we know that
the particle’s subsequent motion will be in a circular path of radius
\[ r = \frac{mv}{qB} \tag{7.35} \]
If we substitute $v = E/B$ from Equation 7.34 into the equation above, we see that the radius is
\[ r = \frac{mE}{qB^2} = \frac{m}{q}\,\frac{E}{B^2} \tag{7.36} \]
After the first part of the detector fine-tunes the particles’ velocity, the second stage forces them
to curve in a path that depends on their mass to charge ratio m/q. If before the detector stage
we ensure that all particles are singly-charged, or at least all have the same charge, the radius of
curvature is directly related to the mass of the particle. The radius of curvature, and thus the
mass of the particles, can be measured by placing a position-sensitive charge detector (green box)
inside the second stage of the detector. Heavier particles curve less in the magnetic field, and land
farther along the green plate, while lighter ones curl in tightly and land closer to the left side of
the detector plate. Instruments based on this principle can be used to identify elements or
compounds, or to separate isotopes of a given element.
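The two stages of the spectrometer can be sketched in a few lines of Python. This is only an illustration under the idealized assumptions used above (uniform fields, the same B in both stages); the function names are ours, and the sample numbers are the ones quoted in Problem 3 of this chapter.

```python
def selector_speed(e_field, b_field):
    """Speed that passes the velocity selector undeflected: v = E / B (Equation 7.34)."""
    return e_field / b_field

def deflection_radius(mass, charge, e_field, b_field):
    """Radius in the magnetic-field-only stage: r = m v / (q B) with v = E / B."""
    v = selector_speed(e_field, b_field)
    return mass * v / (abs(charge) * b_field)

E_CHARGE = 1.6e-19  # C, charge of a singly charged ion

# Numbers from Problem 3: E = 1000 V/m, B = 1.0 T, m = 7.3e-26 kg.
r = deflection_radius(mass=7.3e-26, charge=E_CHARGE, e_field=1000.0, b_field=1.0)
print(f"selected speed = {selector_speed(1000.0, 1.0):.1f} m/s")
print(f"radius of path = {r:.2e} m")

# A heavier ion with the same charge follows a proportionally larger circle.
r_heavy = deflection_radius(mass=2 * 7.3e-26, charge=E_CHARGE, e_field=1000.0, b_field=1.0)
print(f"radius for twice the mass = {r_heavy:.2e} m")
```

Because the radius scales linearly with m/q at fixed E and B, the landing position on the detector plate can be read directly as a mass scale once the charge is known.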
We know already how a single charged particle moves in response to a magnetic field. We also
know that an electric current is nothing more than a stream of moving charges. It is easy to see,
then, that a wire carrying a current should experience a force in a magnetic field. The direction of
the force is perpendicular to the direction of the current and the magnetic field, in agreement with
the first right-hand rule.
What is the force on a current-carrying wire? Let us take a wire of length $l$, carrying a current $I$
in a magnetic field of strength $|\vec{B}| \equiv B$ perpendicular to the wire’s axis. If we break the current up
into single charges moving at the drift velocity $|\vec{v}| \equiv v_d$, then each charge making up the current
experiences a magnetic force $|\vec{F}_B| = q v_d B$. The total force on a segment of wire is the force per
charge carrier times the total number of carriers in the wire. Given the number of carriers per unit
volume $n$, and the wire’s volume $A \cdot l$, we have:
\[ |\vec{F}_B| = (q v_d B)(n A l) \tag{7.37} \]
We already developed Equation 5.5 relating the current to drift velocity, I = nqvd A, so we are
left with:
\begin{align}
v_d &= \frac{I}{nqA} \tag{7.38} \\
|\vec{F}_B| &= n q v_d B A l \tag{7.39} \\
&= nq \frac{I}{nqA} B A l \tag{7.40} \\
&= IBl \tag{7.41}
\end{align}
If the wire isn’t perpendicular to the magnetic field, but at some angle $\theta$ to it, we can repeat the
analysis above with $B \sin\theta$ in place of $B$, and arrive at a general equation for the force experienced
by a current-carrying wire:
Force on a current-carrying wire:
\[ |\vec{F}_B| = IBl \sin\theta \tag{7.42} \]
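A minimal Python sketch of the force on a straight wire segment follows. The function name `wire_force` and the sample values (5 A, 0.2 T, 0.5 m) are hypothetical choices for illustration only; the angle is taken between the current direction and the field.

```python
import math

def wire_force(current, field, length, angle_deg=90.0):
    """Magnitude of the force on a straight wire: F = I B l sin(theta), SI units.

    angle_deg is the angle between the current direction and the field.
    """
    return current * field * length * math.sin(math.radians(angle_deg))

# Assumed example values: 5 A through 0.5 m of wire in a 0.2 T field.
for angle in (0.0, 30.0, 90.0):
    force = wire_force(current=5.0, field=0.2, length=0.5, angle_deg=angle)
    print(f"theta = {angle:5.1f} deg -> F = {force:.3f} N")
```

The force vanishes when the wire runs along the field and is largest when the wire is perpendicular to it, as the $\sin\theta$ factor requires.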
magnetic fields. From this it also follows that a current-carrying wire experiences a force from
another current-carrying wire.
Figure 7.13: Two parallel wires both carry steady currents and exert a force on each other. The field
$\vec{B}_2$ at wire 1 due to wire 2 produces a force on wire 1 given by $F_{12} = B_2 I_1 l$. The force is attractive if
the currents are in the same direction (shown), and repulsive if they are in opposite directions.
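Figure 7.13 shows two long, straight, parallel wires carrying currents $I_1$ and $I_2$ separated by a
distance $d$. Wire 2 produces a magnetic field $\vec{B}_2$, which acts on wire 1. The direction of $\vec{B}_2$ is
perpendicular to the wire, and must have a magnitude:
\[ |\vec{B}_2| = \frac{\mu_0 I_2}{2\pi d} \tag{7.43} \]
Since this field is perpendicular to wire 1, Equation 7.42 with $\sin\theta = 1$ gives the force on a length $l$ of wire 1:
\[ |\vec{F}_{12}| = |\vec{B}_2| I_1 l = \frac{\mu_0 I_2}{2\pi d} I_1 l = \frac{\mu_0 I_1 I_2 l}{2\pi d} \tag{7.44} \]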
For an arbitrary wire, we can better write this in terms of the force per unit length:
\[ \frac{|\vec{F}_{12}|}{l} = \frac{\mu_0 I_1 I_2}{2\pi d} \tag{7.45} \]
where $d$ is the separation between wires 1 and 2 carrying currents $I_1$ and $I_2$, respectively.
The direction of $\vec{F}_{12}$ is downward toward wire 2, as expected from the first right-hand rule.
From Newton’s third law, we additionally know that $\vec{F}_{12} = -\vec{F}_{21}$, that is, the force on wire 2 is
equal and opposite that on wire 1.
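To get a feeling for the size of this force, here is a small Python sketch of the force per unit length between two long parallel wires. The function names and the sample currents are our own assumptions; the sign convention (same-sign currents meaning the same direction) is also just a convenience for this example.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # permeability of vacuum, T*m/A

def force_per_length(i1, i2, separation):
    """Magnitude of the force per unit length between two long parallel wires (N/m)."""
    return MU_0 * abs(i1 * i2) / (2 * math.pi * separation)

def interaction(i1, i2):
    """Same-direction currents (same sign here) attract; opposite currents repel."""
    return "attractive" if i1 * i2 > 0 else "repulsive"

# Assumed example: two wires carrying 1 A each, 1 m apart.
f = force_per_length(1.0, 1.0, 1.0)
print(f"F/l = {f:.1e} N/m ({interaction(1.0, 1.0)})")
print(f"reversing one current: {interaction(1.0, -1.0)}")
```

Even for ampere-scale currents a meter apart the force is tiny, which is why the effect is only noticeable for large currents or closely spaced wires.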
Two parallel wires carrying current in the same direction attract each other, and as you might
expect, when the currents are in the opposite direction they repel one another. The reason for the
force being attractive for currents in the same direction and repulsive for opposing currents relates
to the magnetic field lines in the region between the two wires, as shown in Figure 7.14. When the
currents are in the same direction, the field lines tend to cancel between the wires, which leads to
an attractive force. If the currents are identical, the field is exactly zero on a line halfway between
the wires. When the currents oppose, the field lines reinforce between the wires, enhancing the
field and leading to a repulsive force.
Figure 7.14: The magnetic field lines around two parallel current-carrying wires. (a) When both
currents are into the page, the field lines circulate clockwise for both wires, and in the region between
the wires they tend to cancel one another. The net interaction is attractive. (b) When the currents
are in opposite directions, the field is reinforced between the wires, since one has a clockwise and
one a counter-clockwise circulating field. The net interaction is repulsive.
Now we know how to find the force on a straight length of current-carrying wire in a magnetic field.
From there, it is no big trick to show that a loop of wire in a magnetic field experiences a torque.
This result will be crucial to understanding how, e.g., electric motors and generators function in
the next chapter.
Figure 7.15: (a) Top view of a rectangular loop carrying a current $I$ in a magnetic field $\vec{B}$. No
magnetic forces act on the sides of length $a$ parallel to $\vec{B}$, since they are parallel to the field, but
forces do act on the sides of length $b$, since they are at right angles to the field. (b) A side view of the
loop, showing that the forces $\vec{F}_1$ and $\vec{F}_2$ on the $b$ sides create a torque that tends to rotate the loop
clockwise. (c) If $\vec{B}$ is at an angle $\theta$ with a line perpendicular to the loop plane, the torque is $BIA \sin\theta$
where $A$ is the area of the loop (i.e., $b \cdot a$).
Take the loop of wire carrying a current $I$ in a constant magnetic field $\vec{B}$ in Figure 7.15a.
No magnetic forces act on the sides of length $a$ parallel to $\vec{B}$, since they are parallel to the field
($\sin\theta = 0$). We do expect forces to act on the sides of length $b$, however, since they are at right
angles to the field. Further, since the sides are identical except for the fact that the current is in
opposite directions, we expect that they experience the same magnitude of force (Equation 7.42),
but in opposite directions:
\[ |\vec{F}_1| = |\vec{F}_2| = BIb \tag{7.46} \]
From right-hand rule #1, the force on the left side of the loop, $\vec{F}_1$, has to be out of the page,
while the force on the right side of the loop, $\vec{F}_2$, has to be into the page. If we fix the loop such
that it pivots along a vertical axis running through the middle of the loop (the dashed line in
Figure 7.15b), what will happen?
Figure 7.15b shows the loop viewed on edge. Both forces try to rotate the loop clockwise about
the pivot axis. Recall that a torque $\vec{\tau}$ occurs when we have a force $F$ applied some distance $d$ from a
pivot point, and $|\vec{\tau}| = F d \sin\theta_{Fd}$, where $\theta_{Fd}$ is the angle between the force and the displacement to
the pivot point. Consistent with our right-handed coordinate system, positive torque corresponds
to clockwise rotation.
The forces $\vec{F}_1$ and $\vec{F}_2$ are applied at a distance $a/2$ from the loop’s pivot point, and the angle
between the force and displacement is $90^\circ$, so the net torque is:
\[ |\vec{\tau}|_{\max} = F_1 \frac{a}{2} + F_2 \frac{a}{2} = (BIb)\frac{a}{2} + (BIb)\frac{a}{2} = BIab \tag{7.47} \]
The area of the loop is $A = ab$, so we can express the torque more generally as
\[ |\vec{\tau}|_{\max} = BIA \tag{7.48} \]
This simple result only holds when the field $\vec{B}$ is parallel to the plane of the loop. We can easily
repeat the analysis for the case when the field makes an angle $\theta$ with a line perpendicular to the plane of
the loop, as shown in Figure 7.15c. All we have to do is change $B$ to $B \sin\theta$ - the torque only results
from the component of $\vec{B}$ parallel to the loop plane:
\[ |\vec{\tau}| = BIA \sin\theta \tag{7.49} \]
The loop has a maximum torque BIA when the field is parallel to the plane of the loop, and is
zero when the field is perpendicular to the plane of the loop. When placed in a magnetic field, the
loop will tend to rotate to smaller values of θ, until its plane is perpendicular to the field (or such
that its area normal is parallel to the field), minimizing the torque it feels. What good is all this?
The torque created on a current loop by a magnetic field is the basis of many electric motors!
We can further generalize our result and consider not just one loop of wire, but N loops of wire
tied together – a coil. All we have to do is add together the magnitude of the $N$ torques $|\vec{\tau}|$ from
each loop, since they all act in the same direction:
\[ |\vec{\tau}| = BIAN \sin\theta \tag{7.50} \]
where $I$ is the current carried by each loop of area $A$ in a coil of $N$ turns, placed in a
constant magnetic field of magnitude $B$, and $\theta$ is the angle between a line perpendicular
to the loop and $\vec{B}$.
This simple and most general result holds for coils of arbitrary shape, not just rectangles, so
long as the loop lies in a single plane. Since the problem of current loops comes up
fairly often, we often define the quantity $IAN$ to be the magnitude of the magnetic moment of the
coil $|\vec{\mu}|$. The magnetic moment vector $\vec{\mu}$ always points perpendicular to the plane of the coil, and
the angle $\theta$ is now the angle between the magnetic moment and the field $\vec{B}$. Using this definition:
\[ |\vec{\tau}| = |\vec{\mu}||\vec{B}| \sin\theta \tag{7.51} \]
As a last remark, we point out that an electron orbiting an atomic nucleus can be thought of as a
current loop, which implies that atoms would experience a torque when placed in a magnetic field.
In a rough manner of speaking, this is the basis for Magnetic Resonance Imaging (MRI), the actual
details of which are beyond our discussion. Magnetic resonance deals with the magnetic moments
of individual electrons or protons, which are actually due to their quantum-mechanical spin, a topic
we will cover in quantum physics.
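The torque on a coil is easy to tabulate numerically. The Python sketch below evaluates the magnetic moment $IAN$ and the torque $|\vec{\mu}||\vec{B}|\sin\theta$ for a few angles; the function names and the coil parameters (100 turns, 2 A, 0.01 m², 0.5 T) are illustrative assumptions rather than values from the text.

```python
import math

def magnetic_moment(current, area, turns=1):
    """Magnitude of the magnetic moment of a planar coil: mu = N I A (A*m^2)."""
    return turns * current * area

def coil_torque(current, area, field, angle_deg, turns=1):
    """Torque magnitude on the coil: tau = mu B sin(theta) (N*m).

    angle_deg is the angle between the coil's normal (the moment) and the field.
    """
    mu = magnetic_moment(current, area, turns)
    return mu * field * math.sin(math.radians(angle_deg))

# Assumed example coil: 100 turns, 2 A, 0.01 m^2, in a 0.5 T field.
for angle in (0, 30, 90):
    tau = coil_torque(current=2.0, area=0.01, field=0.5, angle_deg=angle, turns=100)
    print(f"theta = {angle:3d} deg -> torque = {tau:.3f} N*m")
```

The torque is largest when the moment is perpendicular to the field and drops to zero when the moment lines up with the field, which is the orientation the coil rotates toward.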
The magnetic field produced by a current-carrying wire can be magnified at a point by bending the
wire into a loop. Consider the loop in Figure 7.16. The small segment of the loop ∆x1 produces a
magnetic field at the loop’s center which is directed out of the page. The segment ∆x2 also produces
a magnetic field directed out of the page, which adds to the field from segment ∆x1 . This occurs
for every tiny segment of the whole loop, with the result that the field at the center is much larger
than anywhere else.
The magnetic field at the center of a loop of radius R carrying a current I is given by
\[ \vec{B}_{\text{center}} = \frac{\mu_0 I}{2R}\,\hat{z} \tag{7.52} \]
We can make current loops into a longer “bar magnet” by adding them together. If we make a
coil of $N$ equivalent loops of wire stacked together, each carrying a current $I$, the field at the center is
\[ \vec{B}_{\text{center}} = N \frac{\mu_0 I}{2R}\,\hat{z} \tag{7.53} \]
In other words, if bending a wire into a single loop enhances the field maximally, then the next
best thing is to just add more loops. The field from every loop just adds to the total, so long as
the currents are all running in the same direction.
7.3.5.1 Solenoids
Instead of stacking individual loops, we can take a long straight wire and bend it into a coil. Such
a coil is called a solenoid, a type of electromagnet. Solenoids are important because they act as
magnets only when current is supplied (there is no remnant field, like a permanent magnet has),
and create an extremely uniform field inside them. Solenoids using superconducting wire are crucial
for creating the large magnetic fields required for Magnetic Resonance Imaging.
Figure 7.18 shows a schematic of a solenoid and its magnetic field lines. The conductors going into
and out of the page carry a current $I$. The field lines inside the solenoid are very nearly parallel, uniformly
spaced, and close together. Consequently, the field inside is strong - being the superposition of the field
of many individual coils - and very uniform. Note how the solenoid looks just like a long bar magnet
now – again, they are nearly indistinguishable.
The field outside the solenoid is weaker, nonuniform, and in the opposite direction. We can make the
field inside more and more uniform by adding more and more coils, making the solenoid longer. If the
solenoid is long compared to its diameter, the field will be very uniform toward the middle.
Figure 7.18: Magnetic field lines around a solenoid. The field is nearly uniform inside if we are far from the
edges, and small outside.
What is the field inside the solenoid? We can use Ampère’s law (Equation 7.24) to find out. Let
us imagine that the total number of turns is $N$, and the length $L$. Take a closed loop for Ampère’s
law like the loop labeled “1” in Figure 7.19. We will consider the solenoid to be so long that the
field outside is essentially zero. Ampère’s law tells us to sum up $B_\parallel \Delta l$ around this loop. Since the
field is constant on each side of rectangle 1 (though not the same on every side), we can just sum
up $B_\parallel \Delta l$ for each side. The contribution from the top and bottom sides is clearly zero - the field is
perpendicular to the length there. The contribution from the outside edge is also zero, since $B \approx 0$
there. The only non-zero contribution is from the inner side of the rectangle:
\[ \sum_{\text{path 1}} B_\parallel \Delta l = B_z L = \mu_0 I_{\text{enclosed}} \tag{7.54} \]
The right-hand side involves the total current that passes through rectangle 1. If there are $N$ loops over
the length $L$, the current enclosed is just $NI$, so
\[ \vec{B} = \mu_0 \frac{N}{L} I \,\hat{z} \equiv \mu_0 n I \,\hat{z} \tag{7.55} \]
where ẑ is on the axis of the solenoid, N is the number of turns of wire each carrying
a current I, and L is the length of the solenoid (so there are n ≡ N/L turns per unit
length).
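The solenoid result is simple enough to evaluate directly. The Python sketch below computes the interior field from the number of turns, the length, and the current; the function name `solenoid_field` and the sample solenoid (1000 turns over 0.5 m, carrying 2 A) are assumptions made for illustration, and the formula only applies far from the ends of a long solenoid.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # permeability of vacuum, T*m/A

def solenoid_field(turns, length, current):
    """Field magnitude inside a long solenoid: B = mu_0 (N / L) I, in tesla."""
    turns_per_length = turns / length  # n = N / L
    return MU_0 * turns_per_length * current

# Assumed example: 1000 turns wound over 0.5 m, carrying 2 A.
b_inside = solenoid_field(turns=1000, length=0.5, current=2.0)
print(f"B inside = {b_inside:.3e} T")

# Only the winding density N/L matters: doubling both N and L changes nothing.
b_longer = solenoid_field(turns=2000, length=1.0, current=2.0)
print(f"B for twice the turns and twice the length = {b_longer:.3e} T")
```

Notice that the radius of the solenoid never enters; within the long-solenoid approximation the interior field depends only on the turns per unit length and the current.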
What if we apply Ampère’s law to loop 2? Again, the top and bottom sides give
no contribution, since the field is perpendicular. The left and right sides experience the same (parallel)
field $B$, but on the right side the length vector is in the same direction as $\vec{B}$, while on the left side it is
opposing. This means that one side gives a positive contribution, and the other an equivalent negative
contribution. So $\sum B_\parallel \Delta l = 0$, which must be true since $I_{\text{enclosed}} = 0$ for this path!
Figure 7.19: Ampère’s law paths for a solenoid.
Path 3 is even easier - if the solenoid is long
enough to neglect the field outside, then the contribution from every side is zero. Again we have
$\sum B_\parallel \Delta l = 0$, and $I_{\text{enclosed}} = 0$, consistent with Ampère’s law.
they also behave as if they are spinning like a top. (This analogy should not be taken literally;
the “true” explanation results from quantum mechanical phenomena.) This spinning motion also
represents moving charge, and with it is also associated a magnetic moment.
Electrons tend to group in pairs such that their “spin” magnetic moments cancel – you might
remember this as Hund’s rule from chemistry. As a result, materials with an even number of
electrons tend not to be strongly magnetic. If there are an odd number of electrons, a net magnetic
moment results. Each one of the N unpaired electrons in a magnetic material possesses a magnetic
moment ~µ. If the material is a permanent magnet, all of the individual moments tend to line up in
the same direction spontaneously, and they add together to form a very large field. If the material
is magnetic, but not a permanent magnet (proceeding section), the moments do not spontaneously
align, but can be forced into alignment with a small external magnetic field.
If we define the number of unpaired electrons per unit volume as $n \equiv N/V$, then the quantity
$n\vec{\mu} \equiv \vec{M}$ is called the magnetization of the material, or the magnetic moment per unit volume. The
quantity $\mu_r$ appearing in Equation 7.56 is the relative permeability of the material, just like $\mu_0$ is the permeability of vacuum.
The net result of this is that magnetic materials behave as if there is a large magnetic field present
inside them, in addition to the external field. This internal magnetic field has a maximum value
when the material is fully magnetized, known as the “saturation magnetization” of the material.
The saturation magnetization can be the equivalent of hundreds of teslas in common magnetic
materials!
\[ \vec{B}_{\text{inside}} = \mu_r \vec{B}_{\text{external}} \tag{7.56} \]
phenomena known as hysteresis, which is the basis for magnetic information storage in hard disks.
We have necessarily left out a great many fundamental details about permanent magnetic
materials. A good introductory place to learn more is:
http://hyperphysics.phy-astr.gsu.edu/hbase/solids/magperm.html.
7.4.2 Electromagnets
Now we can understand a bit how strong electromagnets work. Figure 7.21 shows a cross-section
of an electromagnet. The permanent magnetic material (Iron, for example) is in the shape of an
“O” with one small notch cut out of it. Wrapped around the closed end opposite the notch is a
coil of copper wire of length L (running into and out of the page) with N turns each carrying a
current I.
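Figure 7.21: Cross-section of an electromagnet: an iron core of relative permeability $\mu_r$ with a
“solenoid” coil of $N$ windings carrying a current $I$.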
What happens in this construction? The current in the “solenoid” coil creates a magnetic field
of $\mu_0 N I/L$ in the left-to-right direction. This relatively small magnetic field serves to magnetize
the iron core, such that the field inside the core is $\mu_r$ times the field from the copper coil:
$B_{\text{inside}} \approx \mu_r \mu_0 N I/L$. So what is the field inside the gap? One can use Ampère’s law for that, or the boundary
conditions on the magnetic field (Appendix B) but we will only quote the result for the field inside
the gap here:
\[ B_{\text{gap}} \approx \mu_r \mu_0 \frac{N}{L} I \tag{7.57} \]
So long as the gap is very narrow compared to the size of the core itself, the field is just about
the same as that inside the magnetic material. Just like inside the core itself, the field in the gap
is enhanced by a factor µr , so good electromagnet cores are made from materials with very high
µr . Table 7.1 lists the relative permeability for a few permanent magnetic materials. 29 Given that
µr can be thousands or hundreds of thousands, the reason for having a core in an electromagnet is
clear - it is a magnetic field amplifier!
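The amplification can be made concrete with a short Python sketch comparing the bare coil field with the gap field of Equation 7.57. The value $\mu_r = 1000$ and the coil parameters below are hypothetical numbers chosen only for illustration, and the linear formula ignores saturation of the core, which limits the field in any real material.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # permeability of vacuum, T*m/A

def coil_field(turns, length, current):
    """Field of the bare "solenoid" coil: mu_0 N I / L (tesla)."""
    return MU_0 * turns * current / length

def gap_field(mu_r, turns, length, current):
    """Approximate field in a narrow gap of the core: mu_r mu_0 N I / L (Equation 7.57).

    Valid only for a narrow gap and an unsaturated core.
    """
    return mu_r * coil_field(turns, length, current)

# Hypothetical electromagnet: 100 turns over 0.2 m, 0.1 A, core with mu_r = 1000.
b_coil = coil_field(turns=100, length=0.2, current=0.1)
b_gap = gap_field(mu_r=1000, turns=100, length=0.2, current=0.1)
print(f"bare coil: {b_coil:.3e} T")
print(f"gap field: {b_gap:.3e} T  (amplified by {b_gap / b_coil:.0f}x)")
```

With these assumed numbers a coil field comparable to the Earth's field becomes a gap field of several hundredths of a tesla, which is the sense in which the core acts as an amplifier.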
Table 7.1: Relative Permeabilities and Remnant Fields of Some Magnetic Materials 29
(S), just like a positively charged rod induces a negative charge on a conductor (Sect. 3.2.2). The
difference is, again, that magnetic poles only come in pairs, so there is no magnetic version of
charging by conduction (Sect. 3.2.1).
North
South
East
West
2.2 · 10−9 N
6.6 · 10−15 N
8.8 · 10−19 N
4.4 · 10−13 N
4. Once the particle enters the second region of the detector from the previous question, it is
in a region of magnetic field only. In this region, the particle travels in a circular path. What
is the radius of the circle?
r = mB/qv
r = qvB/m
r = qB/mv
r = mv/qB
6. An electron passes through a magnetic field without being deflected. What can you say
about the angle between the magnetic field vector and the electron’s velocity, if no other forces
are present?
7. What should happen to the length of a spring if a large current passes through it? Hint:
Think about the current in neighboring spring coils.
It shortens
It lengthens
Nothing
7.6 Problems
1. Sodium ions (Na$^+$) move at 0.85 m/s through a bloodstream in the arm of a person standing
near a large magnet. The magnetic field has a strength of $|\vec{B}| = 1.2$ T and makes an angle of $73^\circ$
with the motion of the sodium ions. The arm contains 120 cm$^3$ of blood, with $3.0 \times 10^{20}$ Na$^+$
ions per cubic centimeter.
If no other ions were present in the arm, what would be the magnetic force on the arm?
3. Consider the mass spectrometer shown at left. The electric field between
the plates of the velocity selector is $|\vec{E}| = 1000$ V/m, and the magnetic fields
in both the velocity selector and the deflection chamber have magnitudes of
1.0 T.
Calculate the radius of the circular path in the deflection chamber for a singly
charged ion with mass $m = 7.3 \times 10^{-26}$ kg (corresponding to CO$_2$).
[Figure: mass spectrometer. A charge $-q$ with velocity $v$ enters the velocity selector, where the field $E$ lies between charged plates and $B$ points into the page, then passes into the deflection chamber.]
4. An electron has a velocity of $3 \times 10^6$ m/s perpendicular to a magnetic field and is observed
to move in a circle of radius 0.3 m.
in a straight line instead? Give the magnitude and direction (relative to the B field and the
electron’s velocity).
5. A wire with a weight per unit length of 0.10 N/m is suspended directly above a second wire.
The top wire carries a current of 30 A and the bottom wire carries a current of 60 A. Find the
distance of separation between the wires so that the top wire will be held in place by magnetic
repulsion.
1. North. The proton will experience no force when it is moving in a direction parallel to the
magnetic field. We already know then that the magnetic field is either pointing north or south,
since the proton experiences no force when traveling north. But is it north or south?
When the proton moves east, it experiences a force upward. We can use the first right-hand
rule to find definitively the direction of the $\vec{B}$ field. Put the fingers of your right hand along the
proton’s velocity (east), and point the back of your hand in the direction of the resulting force
(up). Your right thumb now points along the direction of $\vec{B}$ - north.
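\[ |\vec{F}| = (1.6 \times 10^{-19}\,\text{C})(1 \times 10^{5}\,\text{m/s})(55\,\mu\text{T}) = (1.6 \times 10^{-19}\,\text{C})(1 \times 10^{5}\,\text{m/s})(55 \times 10^{-6}\,\text{T}) = 8.8 \times 10^{-19}\,\text{N} \]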
Strictly speaking, we have to note that 1 T = 1 N/A·m to be sure the units come out right.
So long as you use the proper SI units for everything - C, m/s, T - you can usually be sure
everything will work out all right.
3. $E/B = v$. For the particle to make it through the aperture, it has to travel in a straight line.
This will only happen if there is no net up-down force on the particle.
We can see that the $\vec{E}$ field gives an upward force on the negative charge $-q$, while the $\vec{B}$ field
gives a downward force from the first right-hand rule. For there to be no net force, these two
have to balance:
\begin{align*}
|\vec{F}_B| &= |\vec{F}_E| \\
-qvB &= -qE \\
v &= E/B
\end{align*}
4. $r = mv/qB$. If the particle moves in a circular path, the net force it experiences must be
equal to the centripetal force:
\begin{align*}
|\vec{F}_B| &= |\vec{F}_C| \\
qvB &= mv^2/r \\
rqvB &= mv^2 \\
rqB &= mv \\
r &= mv/qB
\end{align*}
5. No, because there are no single magnetic charges. I suggest you read the section in
Chapter 15 on charging by induction once more, and the answer should be clear.
6. Both the first and third are possible. The magnetic force is $\vec{F}_B = q\vec{v} \times \vec{B}$, or $F_B = qvB \sin\theta$
where $\theta$ is the angle between $\vec{v}$ and $\vec{B}$, so the magnetic force is always perpendicular to the
electron’s velocity. The only way the electron can go through the region of magnetic field and
experience no deflection is if it feels no force - a deflection from a straight line path implies
an acceleration, which implies a force. This can only be true if $\theta$ is 0 or $180^\circ$ - the electron’s
velocity and the magnetic field vector have to be parallel, or in opposite directions.
7. It shortens. Think about the individual coils making up a spring. The currents through
segments of adjacent coils are parallel, and hence adjacent sections of the spring coils should
experience an attractive force. Each coil of the spring attracts every other one, and the net
result is that the spring should shorten.
1. ≈ 5630 N. We have a stream of singly-charged Na$^+$ ions (i.e., $q = e = 1.6 \times 10^{-19}$ C) moving
at a velocity $v$ at an angle $\theta$ to a magnetic field $B$. We know the force on a single ion is:
\[ F_B = qvB \sin\theta \]
The total force is just the force per ion times the number of ions. We have a density of ions
$\rho_{\text{Na}^+} = 3 \times 10^{20}$ cm$^{-3}$, and a volume of $V = 120$ cm$^3$. The total number of ions is then just
$N_{\text{Na}^+} = V \rho_{\text{Na}^+} = 3.6 \times 10^{22}$. Note that we don’t really have to convert cm$^3$ to m$^3$, since the
units cancel. Put that all together:
\[ F_{\text{total}} = N_{\text{Na}^+}\, qvB \sin\theta \approx 5630\,\text{N} \]
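As a numerical check of this estimate, the Python sketch below multiplies the force on one ion by the number of ions, using the values given in the problem. The variable names are ours, and we take $e \approx 1.602 \times 10^{-19}$ C; the last digits of the result shift slightly depending on how $e$ is rounded.

```python
import math

E_CHARGE = 1.602e-19  # C, charge of a singly charged ion

def force_per_ion(charge, speed, field, angle_deg):
    """Magnetic force on one ion: F = q v B sin(theta), SI units."""
    return charge * speed * field * math.sin(math.radians(angle_deg))

# Values from Problem 1.
ion_density = 3.0e20   # ions per cm^3
blood_volume = 120.0   # cm^3 (the cm^3 cancel, so no conversion is needed)
n_ions = ion_density * blood_volume

f_single = force_per_ion(E_CHARGE, speed=0.85, field=1.2, angle_deg=73.0)
f_total = n_ions * f_single
print(f"number of ions  = {n_ions:.2e}")
print(f"force per ion   = {f_single:.3e} N")
print(f"total force     = {f_total:.0f} N")
```

The force on a single ion is minuscule; the large total comes entirely from the enormous number of ions assumed to be moving together with no compensating negative charges.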
2. 67.8 A. Basically, we have two current-carrying wires, both carrying the same current I,
and we know the magnitude of the force per unit length between them must be:
\[ \frac{F_m}{l} = \frac{\mu_0 I^2}{2\pi d} \tag{7.58} \]
Here d is the lateral separation of the wires. If the wires are hanging as shown, then the magnetic
force between the wires must be balancing the weight of each wire and the tension in the strings
holding them up. Therefore the magnetic force must be repulsive, and the currents in
the opposite direction.
Notice the free-body diagram in the upper right corner. In equilibrium, the sum of all forces
is zero. Take +x to the right, and +y upward. Then we can write down the net forces in the
x and y directions. For convenience, we will say from now on θ0 ≡ θ/2, since we only
need the half angle to do the problem.
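Writing $T$ for the tension in a support string (our label), equilibrium requires $T \sin\theta_0 = F_m$
horizontally and $T \cos\theta_0 = mg$ vertically. Dividing the two gives $F_m = mg \tan\theta_0$, or per unit length:
\[ \frac{F_m}{l} = \frac{m}{l} g \tan\theta_0 = g\lambda \tan\theta_0 = \frac{\mu_0 I^2}{2\pi d} \]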
Note the substitution λ = m/l above. Now the question is, what is d, the separation of the
wires? Simple plane geometry relates the separation d to the length of the support wires (the
6 cm, we’ll call this h), and the angle θ0 : d = 2h sin θ0 . Put that into the equation above and
solve for I ...
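\[ \frac{|\vec{F}_m|}{l} = \lambda g \tan\theta_0 = \frac{\mu_0 I^2}{2\pi d} = \frac{\mu_0 I^2}{4\pi h \sin\theta_0} \]
\[ I^2 = \frac{4\pi}{\mu_0}\, g \lambda h \sin\theta_0 \tan\theta_0 \]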
Putting in the numbers given, you should get I ≈ 67.8 A. And as noted above, the currents are
in opposite directions.
3. $r = 4.6 \times 10^{-4}$ m. I think (hope?) you are all familiar with the mass spectrometer at
this point. In the left-most region, there are both $\vec{E}$ and $\vec{B}$ fields. The electric force and the
magnetic force act in opposite directions. Since the ion’s velocity is constant, there must be
zero acceleration. If there is zero acceleration, the sum of all forces must be zero:
\[ \Sigma F = F_e - F_m = qE - qvB = 0 \quad\Rightarrow\quad qE = qvB \quad\Rightarrow\quad v = \frac{E}{B} \]
Next, in the region of purely magnetic field on the right, we have only a magnetic force. But, if
the path of the ion is circular, then the sum of all forces must equal the centripetal force:
\[ \Sigma F = F_m = qvB = \frac{mv^2}{r} \quad\Rightarrow\quad r = \frac{mv}{qB} \]
Now we can use the fact that $v = E/B$ and simplify. Again we note that a singly-charged ion
has a charge $q = e$:
\[ r = \frac{mv}{qB} = \frac{m}{qB}\,\frac{E}{B} = \frac{mE}{qB^2} \]
Plug in the numbers given (no unit conversions for once), and you find $r = 4.6 \times 10^{-4}$ m, or
$r = 0.46$ mm.
4. Since we have two parallel wires with currents flowing, we know we are going to have a
magnetic force between them. Now the problem says that the magnetic force is repulsive (which implies
that the currents are in opposite directions), which it must be in order for the top wire to
be held in place against the force of gravity. This means we want to balance the gravitational
force on the top wire, acting downward, against the repulsive magnetic force between the two
wires, acting upward on the top wire.
The top wire is quoted to have a weight per unit length of 0.10 N/m, which we will call χ. A
weight is already a force, mass times gravity, so the problem gives you the gravitational force
per unit length χ = mg/l for some section of wire of length l and mass m. We can relate the
more common mass per unit length λ = m/l and the weight per unit length easily: λg = χ. Since
the force between two parallel current carrying wires is also expressed in terms of force per unit
length, we are nearly done.
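Given the currents in the top and bottom wire ($I_1$ and $I_2$, respectively), the weight per unit
length ($\chi$), and the separation between the wires ($d$), we just have to set the weight per unit
length equal to the magnetic force per unit length, and solve for $d$:
\[ \chi = \frac{\mu_0 I_1 I_2}{2\pi d} \quad\Rightarrow\quad d = \frac{\mu_0 I_1 I_2}{2\pi \chi} = \frac{(4\pi \times 10^{-7}\,\text{N/A}^2)(30\,\text{A})(60\,\text{A})}{2\pi\,(0.10\,\text{N/m})} = 3.6 \times 10^{-3}\,\text{m} \]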
Of course, the units work out much easier if you know that $\mu_0$ can be expressed in T·m/A
or N/A$^2$, the two are equivalent: $\mu_0 = 4\pi \times 10^{-7}$ T·m/A and $\mu_0 = 4\pi \times 10^{-7}$ N/A$^2$. This
equivalence makes some sense – the first set of units comes from thinking about the field created
by a current-carrying wire, while the second comes from thinking about the force between two
current-carrying wires.
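The balance between weight and magnetic repulsion can be checked in a few lines of Python. The function name `levitation_separation` is our own; the numbers are the ones given in the problem (30 A, 60 A, and a weight per unit length of 0.10 N/m).

```python
import math

MU_0 = 4 * math.pi * 1e-7  # permeability of vacuum, N/A^2

def levitation_separation(i_top, i_bottom, weight_per_length):
    """Separation at which magnetic repulsion per unit length equals weight per unit length.

    Solves chi = mu_0 I1 I2 / (2 pi d) for d; currents in A, chi in N/m, d in m.
    """
    return MU_0 * i_top * i_bottom / (2 * math.pi * weight_per_length)

d = levitation_separation(i_top=30.0, i_bottom=60.0, weight_per_length=0.10)
print(f"d = {d:.2e} m = {d * 1e3:.1f} mm")

# Check: the magnetic force per unit length at this separation equals the weight per unit length.
force_per_length = MU_0 * 30.0 * 60.0 / (2 * math.pi * d)
print(f"F/l at this separation = {force_per_length:.2f} N/m")
```

Since the repulsion grows as the wires approach and weakens as they separate, this separation is a stable resting height for the top wire against small vertical displacements.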
5. COMING SOON!
6. There is nothing special to do here, except calculate the field at a given point due to each
individual wire, and add the results together to get the field due to all three wires. Of course,
you have to add the fields as vectors, which makes this much more fun. In order to do that,
let’s define our problem a bit better. Let’s call our y axis the line connecting point B with the
upper and lower wires, and our x axis the line connecting points A, B, and C. We will label
the upper-most wire “1”, the lower-most wire “2”, and the farthest right wire “3.”
We can notice right away that the current is out of the page for all three wires, which means that
the field will circulate counterclockwise, and be constant on circles centered around each wire.
Further, for any point along the x axis, the x components of the fields from wires 1 and 2 are
always going to be equal and opposite. This can be deduced from symmetry alone - they are at
the same x position, the same distance away from the x axis, and carry the same current in the
same direction. This means already that the field at all three points will have no x component.
But we get ahead of ourselves ...
Point A: First, we should think about the direction of the fields. Around each wire, draw a
circle centered on the wire, which goes through point A. Since the wires go out of the page, the
field is constant on this circle, and it circulates counterclockwise. The direction of the field is
tangent to the circle for each wire. Since the circles around wires 1 and 2 have the same radius,
this means that the fields from wires 1 and 2 have the same magnitude.
From the symmetry of the problem (or basic geometry), one can see that the x components of
the fields from wires 1 and 2 are equal and opposite, leaving only a downward component. The
field from wire 3 is purely down, and has no x component to begin with. Thus, the total field
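is purely downward, and we only need to add up the y components of the fields from each wire.
The figure below might help you see this:
[Figure: geometry for point A. Wires 1 and 2 (currents I1 and I2) sit a distance a above and below the x axis, each a distance a√2 from point A; wire 3 (current I3) lies on the x axis. The fields B1, B2, and B3 at point A are drawn tangent to circles centered on each wire.]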
We also know - again from symmetry or basic geometry - that the y components of the field
from wires 1 and 2 must be the same. Both wires are a distance a√2 from point A. First, let’s
calculate the magnitude of the field from wires 1 and 2:
$$|\vec{B}_{1,A}| = |\vec{B}_{2,A}| = \frac{\mu_0 I}{2\pi a\sqrt{2}}$$
The y component of the field is no problem - all the relevant angles are 45 and 90 degrees:
$$B_{1,A,y} = B_{2,A,y} = -\frac{\mu_0 I}{2\pi a\sqrt{2}}\cos 45^\circ = -\frac{\mu_0 I}{4\pi a}$$
Again, the direction is purely downward in the −y direction. Take care that the distance a is
converted to meters ...
Point B: At point B, things are even simpler. Since point B is perfectly between wires 1 and
2, the fields from wires 1 and 2 perfectly cancel each other. All we are left with is the field
from wire 3, a distance 2a from point B, which is again purely in the −y direction:
$$B_{B,\text{tot}} = B_{3,B,y} = -\frac{\mu_0 I}{2\pi \cdot 2a} = -\frac{1}{4}\,\frac{\mu_0 I}{\pi a} \approx -14\ \mu\text{T}$$
Point C: This is similar to point A in fact. Again, the x components of the fields from wires
1 and 2 must cancel, and again the field from wire 3 has only a y component. Wires 1 and 2
are a distance a√2 from point C, so the y components of their fields will be exactly the same
as they were for point A, except that now the field points up instead of down:
$$B_{1,C,y} = B_{2,C,y} = \frac{\mu_0 I}{4\pi a}$$
The field from wire 3 still points down, and has only a y component as well. Wire 3 is a distance
a from point C, so its field is:
$$B_{3,C,y} = -\frac{\mu_0 I}{2\pi a}$$
If we add up the total field, which still has only y components:
$$B_{C,\text{tot}} = B_{1,C,y} + B_{2,C,y} + B_{3,C,y} = \frac{\mu_0 I}{4\pi a} + \frac{\mu_0 I}{4\pi a} - \frac{\mu_0 I}{2\pi a} = \left(\frac{1}{4} + \frac{1}{4} - \frac{1}{2}\right)\frac{\mu_0 I}{\pi a} = 0$$
So all three fields exactly cancel, and the field is precisely zero at point C. Which one could
have guessed from symmetry alone ...
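The vector sums above can be checked numerically. The sketch below is only an illustration: the wire and point coordinates are inferred from the distances quoted in this solution (wires 1 and 2 at (0, ±a), wire 3 at (2a, 0), points A, B, C at (−a, 0), (0, 0), (a, 0)), and the values of I and a are assumed, not taken from the problem. Results are printed in units of µ0 I/(πa) so they do not depend on those assumed values.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # T·m/A


def field_from_wire(wire_xy, point_xy, current):
    """Field (Bx, By) at point_xy due to a long straight wire at wire_xy
    carrying `current` out of the page (+z)."""
    rx = point_xy[0] - wire_xy[0]
    ry = point_xy[1] - wire_xy[1]
    r_squared = rx * rx + ry * ry
    scale = MU_0 * current / (2 * math.pi * r_squared)
    # z-hat cross r = (-ry, rx): the counterclockwise tangential direction.
    return (-scale * ry, scale * rx)


def total_field(wires, point_xy, current):
    """Vector sum of the fields from every wire at one point."""
    fields = [field_from_wire(w, point_xy, current) for w in wires]
    return (sum(f[0] for f in fields), sum(f[1] for f in fields))


# Assumed values, for illustration only.
I = 2.0   # A
a = 0.01  # m

wires = [(0.0, a), (0.0, -a), (2 * a, 0.0)]             # wires 1, 2, 3
points = {"A": (-a, 0.0), "B": (0.0, 0.0), "C": (a, 0.0)}
unit = MU_0 * I / (math.pi * a)                          # natural field unit

results = {name: total_field(wires, p, I) for name, p in points.items()}
for name, (bx, by) in results.items():
    print(f"{name}: Bx = {bx / unit:+.3f}, By = {by / unit:+.3f}  [mu0 I / (pi a)]")
```

With this layout the x components vanish at all three points, the field at B is −1/4 in these units, and the field at C sums to zero, in agreement with the results above.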
Figure 8.2: (a) When a magnet is pushed through a coil of wire, a voltage is induced in the coil. (b) When the coil moves around the magnet, a voltage is also induced. Whether the loops of wire move or the magnet moves is immaterial, a voltage is induced so long as the magnetic flux through the coil changes in time.
Strictly speaking, it is not a current that is induced in the coil, but a voltage difference between
its end points. If the coil is part of a closed electric circuit, a current will flow, but a potential
difference will be induced even in a disconnected coil. If there is a voltage present, and the wire
is conducting, this means that an electrical current will be induced in any closed circuit when the
magnetic flux through a surface changes. Electromagnetic induction underlies the operation of
generators, induction motors, transformers, and most other electrical machines.
8.2 Faraday’s Law of Induction
The induced voltage is produced whenever there is relative motion of the coil and magnet, and it
was also discovered that the more loops of wire there are in the coil, the larger the induced voltage,
Fig. 8.3. Twice as many loops gives twice as much voltage, everything else remaining the same.
Figure 8.3: (a) A magnet pushed through a coil induces a voltage. (b) When the magnet is pushed through a coil with twice as many loops, the induced voltage is twice as much.
Finally, it was discovered that the induced voltage depends on how fast the magnetic field
through the coil changes - which in this simple example just means how fast the magnet moves
relative to the coil.
Induced Voltages:
The induced voltage in a coil is proportional to the number of loops, and the rate at
which the magnetic field through the loop changes.
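The magnetic flux through a loop of area A in a uniform magnetic field is
$$\Phi_B = B_\perp A = |\vec{B}|\,A\cos\theta_{BA}$$
where θBA is the angle between the surface normal and the magnetic field. The flux
through a closed surface bounding a volume is still zero.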
Magnetic flux is the product of the area of the loop and the perpendicular component of the magnetic
field through it, as shown in Fig. 8.4, and the induced voltage in the coil depends on the rate that
the flux changes with time. What this means is that either the field can be changing in time, or
the area facing the magnetic field can be changing in time, or both, and a voltage will be induced.
For example, we could either move the magnet back and forth into the loop, or rotate the coil in a
constant magnetic field.
Figure 8.4: Magnetic flux through an area A. (a) A uniform magnetic field $\vec{B}$ is incident on an area A at an angle θ with the direction normal to the area. (b) Edge-on view of the area. The flux through the loop is $\Phi_B = B_\perp A = BA\cos\theta$.
The result of all of this is Faraday’s law of electromagnetic induction, which relates the change
in magnetic flux through a loop per unit time to the induced voltage in the loop:
$$\Delta V = -N\frac{\Delta\Phi_B}{\Delta t} \qquad (8.2)$$
This law covers all the basic phenomena we just discussed - the induced voltage depends on
the number of turns in the coil, and how fast the magnetic flux through the coil changes. What
about the minus sign though? What the minus sign says is that the induced voltage will try
to create a current that opposes the change in magnetic flux. If a current is induced in
the coil, it will circulate in such a way to try and stop the change in flux, by creating a magnetic
field of its own.
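To make the scaling concrete, here is a minimal Python sketch of Faraday's law in the form ∆V = −N ∆ΦB /∆t. The numbers (turn count, flux change, time interval) are made up for illustration and do not come from the text.

```python
def induced_voltage(n_turns, delta_flux, delta_t):
    """Average induced voltage from Faraday's law, dV = -N * dPhi / dt.

    n_turns    -- number of loops in the coil
    delta_flux -- change in magnetic flux through one loop, in T·m^2
    delta_t    -- time over which the change happens, in s
    """
    return -n_turns * delta_flux / delta_t


# Illustrative values only: flux rises by 2.0e-3 T·m^2 in 50 ms.
v_100_turns = induced_voltage(100, 2.0e-3, 0.050)
v_200_turns = induced_voltage(200, 2.0e-3, 0.050)   # twice the loops
v_faster = induced_voltage(100, 2.0e-3, 0.025)      # same change, half the time

print(v_100_turns, v_200_turns, v_faster)
```

Doubling the number of turns or halving the time both double the magnitude of the induced voltage, and the negative sign for an increasing flux is the opposition described above.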
For a minute, think about what would happen if the minus sign weren’t there. In this case, a
time-varying flux would create a current in a loop of wire, which would create a field that changes
in the same way as the field causing the flux. This field would then add to the field causing the
flux, which would increase the current even more, and then further add to the original field. This
positive feedback would quickly run amok! Any infinitesimally small change in magnetic field with
time would get amplified, and cause a runaway current in the coil (at least until it melted). Since
this situation is clearly absurd, it makes some sense that the induced current must oppose the
change in flux, rather than add to it. It is precisely this negative feedback of coils which makes
them useful circuit elements, which we will come to in following sections.
Incidentally, this does not mean that the magnetic field created by the induced current is always
opposite that of the field causing the flux in the first place - it is trying to stop the change in flux,
not cancel the flux completely. For example, if the magnetic field causing the flux is increasing,
the induced current will create a field in the opposite direction to oppose the increasing flux, but
if the flux is decreasing, the induced current will create a field in the same direction to “shore up”
the flux and stop it from decreasing. This principle is known as Lenz’s law, and we will return to
8.3 Inductance
8.3.1 Mutual Inductance
As a concrete example, consider the two solenoids in Fig. 8.5. The top solenoid is powered with
a time-varying current I(t) = I0 cos ωt, which produces a time-varying magnetic field $|\vec{B}(t)|$ =
B0 cos ωt. This time-varying magnetic field creates a time-varying flux in the lower “pickup”
solenoid, which in turn leads to an induced voltage. The current, magnetic field, and induced
voltage all vary sinusoidally, though not all with the same phase as we shall see.
Figure 8.5: Mutual induction of two solenoids. The top solenoid (the source coil) is powered with a time-varying current I(t) = I0 cos ωt, which produces a time-varying magnetic field $|\vec{B}(t)|$ = B0 cos ωt. This time-varying magnetic field creates a time-varying flux in the lower solenoid (the “pickup” coil), which in turn leads to an induced voltage.
What is the phase relationship between the current in the source coil and the voltage in the
“pickup” coil? First, we know that the magnetic field created by the source coil is just proportional
to the current in the coil, so it will be in phase with the current. When the current is at a maximum,
so is the magnetic field.
This in turn means that when the current and magnetic field are maximal, then the flux in the
pickup coil is maximal - only the magnetic field is changing, the area is constant in this case. The
induced voltage in the pickup coil, however, depends on the time rate of change of the flux, not the
flux itself. When is the rate of change maximal? The time rate of change ∆ΦB /∆t is nothing more
than the slope of the flux versus time curve. Since the current and magnetic field are sinusoidal
in time, cos ωt, so too is the flux in the pickup coil. The slope of a sinusoidal curve is largest
where the curve crosses zero, and the slope is zero at the peaks and troughs.
What this means is that ∆ΦB /∆t for the pickup coil is maximum whenever the field from the
source coil, and therefore the current in the source coil, is zero. Whenever the current in the source
coil is maximal, the induced voltage is zero, since ∆ΦB /∆t is zero. In short, the induced voltage
in the pickup coil is still sinusoidal, but a quarter cycle (90◦ ) out of phase with the current in the
source coil.
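The quarter-cycle shift can be seen numerically by estimating the slope of a cosine-shaped flux over a very small time step. This is only a sketch: the frequency, peak flux, and single-turn pickup coil are assumed values chosen for illustration.

```python
import math

OMEGA = 2 * math.pi * 60.0  # assumed angular frequency (60 Hz), rad/s
PHI_0 = 1.0e-3              # assumed peak flux through the pickup coil, T·m^2
N_TURNS = 1                 # assumed single-turn pickup coil


def flux(t):
    """Flux in the pickup coil, in phase with the source current (cos wt)."""
    return PHI_0 * math.cos(OMEGA * t)


def induced_voltage(t, dt=1e-7):
    """dV = -N * dPhi/dt, with the slope estimated over a small step 2*dt."""
    return -N_TURNS * (flux(t + dt) - flux(t - dt)) / (2 * dt)


period = 2 * math.pi / OMEGA
for fraction in (0.0, 0.25, 0.5, 0.75):
    t = fraction * period
    print(f"t = {fraction:.2f} T: flux = {flux(t):+.2e}, voltage = {induced_voltage(t):+.3f}")
```

The voltage is zero where the flux peaks (t = 0 and t = T/2) and reaches its extremes, of magnitude N Φ0 ω, where the flux crosses zero.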
What this setup essentially does is wirelessly transmit power from the source to the pickup
coil through the time-varying magnetic field. This is known as mutual inductance, the basis for
electrical transformers. The key thing in designing a transformer is to somehow focus as much
of the flux from the source coil as possible and guide it into the pickup coil, such that as much
electrical energy from the source coil is transferred into the pickup coil as possible. One relatively
easy way to do this is to use a high permeability magnetic material to guide the flux, as in an
electromagnet (Section 7.4.2).
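$$\Delta V = -N\frac{\Delta\Phi_B}{\Delta t} \qquad (8.3)$$
We also know that the change in flux is just due to the current in the coil itself, so:
$$\frac{\Delta\Phi_B}{\Delta t} \propto \frac{\Delta I}{\Delta t} \qquad (8.4)$$
Combining these two facts, we see that the induced voltage in the coil ∆V must be proportional
to the change in current with time:
Self-inductance:
$$\Delta V \propto \frac{\Delta I}{\Delta t} \quad\text{or}\quad \Delta V = -L\frac{\Delta I}{\Delta t} \qquad (8.5)$$
We can also use our two proportionality equations, Eq. 8.3 and 8.4, to find an expression for L
itself:
$$L = N\frac{\Delta\Phi_B}{\Delta I} = \frac{N\Phi_B}{I} \qquad (8.6)$$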
The units of inductance L are volt-seconds per ampere [V·s/A], or henries [H].
So the inductance depends on the number of turns in the coil, the current in the coil, and
the flux of course. The fact that inductance depends on flux means that it is a function of the
coil geometry, and in general is difficult to calculate. For a simple solenoid, we know all of these
quantities, however, and one can show $L = \mu_0 N^2 A/l = \mu_0 n^2 V$, where A, l, and V are the
cross-sectional area, length, and volume of the coil, respectively, and n = N/l as usual.
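As an illustration, the sketch below evaluates the solenoid formula both ways, as µ0 N²A/l and as µ0 n²V, for a hypothetical coil. The dimensions and turn count are assumptions chosen only to give round numbers.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # T·m/A


def solenoid_inductance(n_turns, area, length):
    """Inductance of an ideal long solenoid, L = mu0 * N^2 * A / l."""
    return MU_0 * n_turns**2 * area / length


# Hypothetical coil: 500 turns, 1 cm radius, 10 cm long.
N = 500
radius = 0.01                   # m
length = 0.10                   # m
area = math.pi * radius**2      # cross-sectional area, m^2

L_from_turns = solenoid_inductance(N, area, length)

# Same result written with turns per unit length n and coil volume V.
n = N / length
volume = area * length
L_from_density = MU_0 * n**2 * volume

print(f"L = {L_from_turns * 1e3:.3f} mH")
```

For this coil the inductance comes out just under 1 mH, and doubling the number of turns at fixed length and area quadruples it.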
The inductance of a coil tells us how dramatically a coil responds to changes in its own current.
From Lenz’s law, we know that the induced voltage in the coil will try to stop any changes in its
flux, which means opposing changes in current in the coil itself. Inductance is therefore a sort of
a “resistance to change in current,” which makes inductors useful in circuits with time-varying
signals. This is nothing more than the “negative feedback” implied by Lenz’s law we referred to
above, and the negative feedback of inductors can be used in, e.g., audio amplifiers and many other
circuits to “smooth out” rapid changes or fluctuations in signals, as we will explore further below.
The reluctance of a coil of wire to change current rapidly due to its self inductance can actually be
a useful thing in an electronic circuit. Undesired rapid changes in current (due to a power spike,
for example) can be smoothed out by putting an inductive element in a circuit, and some GFI
outlets are based on this idea. Filters for high-frequency signals (such as those in audio amplifiers)
can be built from inductors due to their reluctance to allow rapidly-varying currents through them
– a rapidly-varying current is just another description of a high-frequency signal. Combined with
capacitors, which like to let rapidly-varying signals through but not slowly-varying (or dc) signals,
one can tailor the frequency or time response of all sorts of circuits. A circuit element used primarily
for its self-inductance is simply called an inductor.
Owing to the fact that inductors also store energy in their magnetic field, which we will discuss
in the following section, inductors can also be used to temporarily store energy, just like capacitors.
In fact, capacitors and inductors are closely linked conceptually:
This rule of thumb will become more clear when we discuss the inductive version of the RC
circuit, the RL circuit.
8.3.2.2 RL Circuits
Before we discuss circuits with inductors in them, we should first think about what role inductance
might play in some of the simple circuits we have constructed so far. Consider the simple resistive
circuit in Fig. 8.6a. This is a circuit we have seen many times before. We know that once the
switch S is closed, there will be a current of I = ∆V /R in the resistor, so the voltmeter across the
resistor will read ∆V (provided the wires have negligible resistance). Now, what happens just after
we close the switch S? Is there immediately a current I = ∆V /R in the resistor, or does it take a
little while to build up? In either case, why?
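[Figure 8.6: (a) A simple resistive circuit with a switch S. (b) The same circuit with the switch closed: the current I in the loop creates a magnetic field B.]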
First, remember that any working electric circuit has to create a closed loop – current has to go
from the source, around a circuit, and back into the source. Next, remember from the preceding
section that any closed loop of wire has a self inductance L, which tends to resist rapid changes
in current. Putting this together, the closed loop of the circuit itself acts as an inductor, and tries
to resist changes in current. Since we have to have a closed loop to make any kind of circuit, this
means that all of our other circuits already behave as if they have inductors present!
As soon as the switch S is thrown, I doesn’t immediately change from 0 to ∆V /R. The current
beginning to flow in the circuit creates a magnetic field $\vec{B}$ circulating around the wires in the circuit,
which in turn increases the flux inside the loop, Fig. 8.6b. This changing flux induces a voltage that
opposes the rise in current, so the current builds up gradually rather than instantly. Eventually, a
steady-state is reached, and the current is constant. The constant current will lead to a voltage drop
across the resistor of ∆VR = IR. The voltage drop across the resistor represents an opposition to the
current – the larger the current, the larger the voltage drop across the resistor.
Now, for our first real inductive circuit, an inductor L connected in series with a voltage source
∆V , Fig. 8.7. What happens when we close the switch S? At the instant the switch is thrown, the
current tries to flow into the inductor. We know from our discussion of self inductance that when
the current is changing in time, a voltage is induced in our inductor. Using the loop rule, which
says that the sum of voltage drops and sources around a closed loop must sum to zero, we can
$$\Delta V_L = -L\frac{\Delta I}{\Delta t} \qquad (8.7)$$
This looks a lot like the voltage drop across a resistor, and by analogy we interpret the
inductance L as an opposition to the change in current. The faster the current changes,
the larger ∆I/∆t, and the larger the voltage built up in the inductor – the faster you try to change
the current in an inductor, the more readily it “soaks up” the available voltage to confound your
efforts. For this reason, inductors can be useful for preventing rapid surges in current. Connecting
a point in a circuit to ground via an inductor effectively shunts away rapid current variations to
protect sensitive equipment.
We are finally ready for a more useful circuit, the RL series circuit shown in Fig. 8.8. Suppose
the switch is closed at t = 0. As soon as this happens, the current begins to increase, but the
inductor tries to prevent it from increasing too quickly – the maximum voltage drop across the
inductor occurs when the current is changing most rapidly, right when the switch is closed. By
“stealing” as much of the voltage from the source as possible, the inductor prevents the resistor
from taking part of the voltage drop and thereby inhibits current from flowing.
Figure 8.8: An “RL” circuit. The inductor prevents the current from changing too quickly in the circuit - the current through an inductor behaves like the voltage across a capacitor.
As the current approaches its steady-state value, the changes in current become less and less,
and the inductor has a smaller and smaller voltage drop. When the current is finally stabilized at
a constant value, an ideal inductor actually has no voltage drop, since the current isn’t changing
at all, ∆I/∆t = 0. This is reminiscent of our RC circuits of Sect. 6.6. Only while the capacitor was
charging or discharging did a current flow, not in the steady state. Inductors behave toward voltage
as capacitors behave toward current – a voltage only develops across an ideal inductor when the
current is changing just like a current only flows in a capacitor when the voltage is changing.
In the case of RC circuits, we found that the time it took to charge or discharge the circuit
depended on a time constant τ = RC. In the case of an RL circuit, we can define a similar time
constant which gives the time required for the current to get within a fraction 1/e of its
steady-state value (see footnote i):
$$\tau = \frac{L}{R} \qquad (8.8)$$
This gives τ in seconds [s] when R is in ohms [Ω] and L is in henries [H].
The equation for the current as a function of time for an RL circuit is also just like the voltage as
a function of time for an RC circuit:
$$I(t) = \frac{\Delta V}{R}\left(1 - e^{-t/\tau}\right) \qquad (8.9)$$
Just like a capacitor takes time to charge up, an inductor takes time to let a current flow. The
larger the inductance L, the longer it takes for the current to reach its steady state value, just
like varying C in an RC circuit. In contrast to the RC case, however, increasing R decreases the
waiting time. In the RL circuit, a larger resistance is able to “steal” more of the voltage from the
inductor, lessening its ability to impede current flow. In the RC case, increasing the resistance
also “steals” more voltage from the source, which leaves a smaller voltage available to charge the
capacitor – hence it takes longer.
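A short numerical sketch may help here. It evaluates I(t) = (∆V /R)(1 − e^(−t/τ)) with τ = L/R for a hypothetical circuit; the source voltage, resistance, and inductance are assumed values, not taken from the text.

```python
import math


def rl_time_constant(inductance, resistance):
    """Time constant tau = L / R of a series RL circuit, in seconds."""
    return inductance / resistance


def rl_current(t, delta_v, resistance, inductance):
    """Current a time t after the switch closes in a series RL circuit."""
    tau = rl_time_constant(inductance, resistance)
    return (delta_v / resistance) * (1 - math.exp(-t / tau))


# Hypothetical circuit values, for illustration only.
DELTA_V = 12.0   # V
R = 6.0          # ohms
L = 0.030        # H

tau = rl_time_constant(L, R)   # 5 ms for these values
i_final = DELTA_V / R          # steady-state current, 2 A

for multiple in (0, 1, 2, 5):
    i = rl_current(multiple * tau, DELTA_V, R, L)
    print(f"t = {multiple} tau: I = {i:.3f} A ({100 * i / i_final:.1f}% of final)")
```

After one time constant the current has reached about 63% of its final value, and after five it is within 1%. Doubling R halves τ (and also halves the final current), while doubling L doubles τ without changing the final current.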
We can even take the analogy between inductors and capacitors one step further. Capacitors
store electrical energy by separating charges. The induced voltage across an inductor prevents the
voltage source from immediately producing a current, which means that the source must do work
to achieve current flow. If the source must do work against the inductor, then there must be some
source of stored energy inside the inductor. As it turns out, the presence of a magnetic field in
the inductor is the source of energy, just like the presence of the electric field between the plates
of a capacitor is a source of energy. Following the same derivation as Sect. 4.6.2, we can relate the
potential energy stored in an inductor to the current in the inductor:
$$PE = \frac{1}{2}LI^2 \qquad (8.10)$$
Footnote i: Here again we mean e, the base of the natural logarithms, not e, the unit of charge.
Again, notice that if you replace L with C and I with V , you have exactly the expression for
potential energy stored in a capacitor. Now we can make our glib rule of thumb even more succinct:
8.4 Transformers
The dual coil setup of Section 8.3.1 is the most basic form of a transformer. If our source solenoid has
N1 turns, and is powered by a voltage ∆V1 , then the magnetic field created by it is proportional
to ∆V1 N1 , since I and ∆V are proportional by Ohm’s law, and B, I, and N1 are proportional.
Faraday’s law tells us that the induced voltage in the pickup solenoid ∆V2 is proportional to the
rate of change of that field, as well as the number of turns in the pickup coil N2 :
$$\Delta V_2 = -N_2\frac{\Delta\Phi_{B1}}{\Delta t} \qquad (8.11)$$
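On the other hand, we now know that we can relate the change in ΦB1 to the voltage in coil 1
through its self inductance (Eq. 8.6):
$$\Delta V_1 = -N_1\frac{\Delta\Phi_{B1}}{\Delta t} \qquad (8.12)$$
Dividing Eq. 8.11 by Eq. 8.12, the common factor ∆ΦB1 /∆t cancels, leaving only the ratio of turns.
Voltage relationship between source (1) and pickup (2) coils in a transformer:
$$\Delta V_2 = \frac{N_2}{N_1}\Delta V_1 \qquad (8.13)$$
Here $N_{1(2)}$ is the number of turns in the source (pickup) coil, and $\Delta V_{1(2)}$ is the voltage on
the source (pickup) coil.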
What this tells us is that when N2 is greater than N1 the pickup coil voltage is actually larger
than that of the source coil, and we call this configuration a “step-up” transformer. Step-up
transformers take a given time-varying voltage, and amplify it by a factor N2 /N1 . When N2 is
smaller than N1 we have a “step-down” transformer, which takes a given time-varying voltage and
reduces it by a factor N2 /N1 .
Of course, there is no free lunch, and we can’t get power from nowhere. The total power input
to the source coil has to equal the total power at the pickup coil, or
$$I_1\Delta V_1 = I_2\Delta V_2 \qquad (8.14)$$
This also gives us the relationship between the currents in the source and pickup coil:
Current relationship between source (1) and pickup (2) coils in a transformer:
$$I_2 = \frac{N_1}{N_2}I_1 \qquad (8.15)$$
Here $N_{1(2)}$ is the number of turns in the source (pickup) coil, and $I_{1(2)}$ is the current in
the source (pickup) coil.
This tells us that if we step up the voltage, we have to step down the current, and vice versa, in
order to conserve energy.
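The two transformer relations can be checked together in a few lines. The sketch below assumes an ideal (lossless) transformer and uses made-up voltage, current, and turn counts purely for illustration.

```python
def transformer_output(v_source, i_source, n_source, n_pickup):
    """Ideal transformer: returns (pickup voltage, pickup current).

    V2 = V1 * N2 / N1 and I2 = I1 * N1 / N2, so V2 * I2 = V1 * I1.
    """
    v_pickup = v_source * n_pickup / n_source
    i_pickup = i_source * n_source / n_pickup
    return v_pickup, i_pickup


# Hypothetical step-up transformer: 100 turns on the source, 500 on the pickup.
V1, I1 = 120.0, 2.0
N1, N2 = 100, 500

V2, I2 = transformer_output(V1, I1, N1, N2)
print(f"V2 = {V2:.1f} V, I2 = {I2:.2f} A")
print(f"power in = {V1 * I1:.1f} W, power out = {V2 * I2:.1f} W")
```

Here the voltage is stepped up by a factor of five while the current is stepped down by the same factor, so the power is unchanged. Swapping N1 and N2 gives the corresponding step-down transformer.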
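[Figure: a conducting bar moving with velocity v through a magnetic field Bin directed into the page. Positive charge accumulates at the top end of the bar and negative charge at the bottom end.]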
This charge imbalance has to give rise to a uniform electric field inside the conductor, $\vec{E}$, directed
downward. Of course, the presence of this charge imbalance and electric field also means the
charges in the bar feel an electric force opposing the magnetic force. Charge keeps separating
until the two forces balance:
$$\Sigma F = q|\vec{E}| - q|\vec{v}||\vec{B}| = 0 \qquad (8.16)$$
When the forces balance, we have equilibrium, and $q|\vec{E}| = q|\vec{v}||\vec{B}|$, or $|\vec{E}| = |\vec{v}||\vec{B}|$. A uniform electric field $\vec{E}$
over the length l of the bar is nothing more than a potential difference, ∆V = El. Putting this all
together, the movement of the conducting bar in a magnetic field leads to a potential difference
across the length of the bar:
$$\Delta V = |\vec{v}||\vec{B}|\,l = |\vec{E}|\,l \qquad (8.17)$$
By itself, this is not so useful, but we can make the moving bar part of an electric circuit, as
shown in Fig. 8.10a. The bar now slides on conducting rails, and the motional voltage produced in
the bar induces a current in the rails. An equivalent circuit is shown in Fig. 8.10b.
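[Figure 8.10: (a) A conducting bar sliding with velocity v on conducting rails in a magnetic field Bin directed into the page, with induced current Iind. (b) Equivalent circuit: a resistor R and a voltage source |∆V| = Blv.]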
In which direction is the induced current, and how big is it? The flux through the closed loop
defined by the rails and the moving bar is just $\Phi_B = |\vec{B}_{in}|A_{loop}$, where $A_{loop}$ is the area of the loop.
The area of the loop is changing with time, of course, since the bar is moving, but the magnetic
field is not. We can easily write down the magnitude of the induced voltage, ∆V , which along with
Ohm’s law will give the current I = ∆V /R:
$$\Delta V = -\frac{\Delta\Phi_B}{\Delta t} = -B\frac{\Delta A}{\Delta t} \qquad (8.18)$$
The movement of the bar at a constant velocity $\vec{v}$ implies that it covers a distance ∆x in a time
∆t. The area of the loop therefore changes by ∆A = l∆x in that time, so the rate of change of the
area can be found easily:
$$\Delta V = -B\frac{\Delta A}{\Delta t} = -B\frac{l\,\Delta x}{\Delta t} = -Blv \qquad (8.19)$$
As we might have expected, the induced voltage ∆V , and the current I = ∆V /R depend on how
fast the bar moves, how big the field is, and how long the bar is. Further, we see how a constant
magnetic field can still give rise to a time-varying magnetic flux - if the field is constant, we have
to change the area for there to be an induced voltage. This sort of voltage is called a “motional
voltage,” or “motional EMF,” since it results from a conductor moving in a magnetic field (see footnote ii).
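As a rough numerical illustration, the sketch below evaluates |∆V | = Blv and I = ∆V /R for a sliding bar, and then compares the mechanical power needed to keep the bar moving with the electrical power dissipated in the resistor. All numbers are assumed values. The retarding force on the bar is taken to be F = BIl, the usual force on a straight current-carrying wire perpendicular to a field, which is not derived in this section.

```python
def motional_voltage(b_field, bar_length, speed):
    """Magnitude of the motional voltage, |dV| = B * l * v."""
    return b_field * bar_length * speed


def induced_current(b_field, bar_length, speed, resistance):
    """Induced current from Ohm's law, I = |dV| / R."""
    return motional_voltage(b_field, bar_length, speed) / resistance


# Assumed values, for illustration only.
B = 0.50      # T
L_BAR = 0.20  # m
V = 3.0       # m/s
R = 1.5       # ohms

voltage = motional_voltage(B, L_BAR, V)
current = induced_current(B, L_BAR, V, R)

# Retarding magnetic force on the bar (assumes F = B * I * l).
retarding_force = B * current * L_BAR
mechanical_power = retarding_force * V   # work per second done by the push
electrical_power = current**2 * R        # power dissipated in the resistor

print(f"|dV| = {voltage:.2f} V, I = {current:.2f} A")
print(f"P_mech = {mechanical_power:.3f} W, P_elec = {electrical_power:.3f} W")
```

The two powers agree: the work done pushing the bar against the retarding force is exactly what ends up heating the resistor.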
But what about the direction of the current? When the bar moves to the right, due to some
external force $\vec{F}_{appl}$, the flux is increasing with time. The induced current wants to oppose the
change in flux, which in this case means it wants to slow the motion of the bar. This is consistent
with the magnetic force on the bar being to the left, opposing the external force. The induced
current wants to stop the increase in flux, so it will circulate in a direction that opposes the
constant field in the loop, i.e., counterclockwise.
What if we reverse the direction of the bar’s velocity, as shown in Fig. 8.11b? If the bar were
moving to the left instead, the flux would be decreasing. The induced current would circulate in
such a way to stop this decrease - it would try to increase the flux in the loop, and would therefore
circulate clockwise. Induction always acts in such a way to reduce ∆Φ/∆t, whether this means
increasing Φ or decreasing it.
The most important thing in either case is that the magnetic force and induced current are
opposing the motion of the bar, which is causing the change in flux in the first place. And, we have
a nice symmetry between electricity and magnetism now.
In the end, there is nothing unique about our conducting bar from the previous section. Any time
we have a moving conductor intersecting a magnetic field, or vice-versa, there is an induced current
and a retarding force. The relative motion of a conductor and a magnetic field causes a circulating
current within the conductor. These induced currents are also known as “eddy currents,” since they
are somewhat analogous to the swirling currents created when you move an oar through the water,
for instance.
As we know, these induced eddies of current create magnetic fields that oppose the change in
flux through their diameter. As a concrete example, consider the pendulum in Fig. 8.12a, which
consists of a conducting plate swinging through a region of constant magnetic field, perpendicular
to the plane of the pendulum’s motion. If we pull the plate back to an angle θ and release it, we
once again have a moving conductor in a magnetic field, Fig. 8.12b, and we expect again a retarding
force.
Footnote ii: EMF stands for “electromotive force,” a somewhat antiquated term for a source of voltage, originating from
earlier times when physicists did not make a hard distinction between force and energy. We have avoided this term
wherever possible to avoid confusion, as substituting “voltage” changes no essential physics.
[Figure: conducting bar on rails in a magnetic field Bin directed into the page, showing the induced current Iind, the velocity v, and the magnetic force Fm on the bar.]
[Figure 8.12, panels (a) and (b): a conducting plate swinging as a pendulum through a region of magnetic field.]
As the conducting pendulum moves through the magnetic field, circulating currents form, which
generate a magnetic field opposing the change in flux through the conducting plate. The only way
to reduce the rate of change of flux through the plate in this case is to slow the pendulum down, so
induction results in a strong braking force on the pendulum. In contrast, a non-conducting pendulum will experience
no additional force.
In short, a conducting pendulum in a (perpendicular) magnetic field will be dramatically slowed
and stopped, hence the name “eddy current brake.” If the field is sufficiently strong, the pendulum
will not even complete one cycle, which (hopefully) you have seen demonstrated in class. The
stronger the magnetic field the pendulum moves through, or the greater the electrical conductivity of
the conductor, the greater the currents developed and the greater the opposing force.
Eddy current brakes can be quite useful, and are actually used in the braking mechanism of
some (metallic) train wheels. One advantage is that the eddy current braking effect is stronger
when the wheels spin faster, so as the train slows, the braking gradually lets up on its own, and
produces a smooth stopping motion.
Eddy currents are also useful for traffic detection systems, detection of coins in vending machines,
and metal detectors – in all three of these cases, one can make use of the induced currents and
forces when conductors move through magnetic fields. Can you imagine how eddy currents could
be used in each case?
Repeat the experiment with a piece of plastic (e.g., PVC) pipe. Now there should be no
difference. Why?
8.6 Generators
We found in Chapter 7 that we could make a simple electric motor by utilizing the torque on a
current loop in a magnetic field. Electromagnetic induction allows the opposite, a generator – we
can create an electric current by spinning a loop of wire in a magnetic field. In fact, motors and
generators rely on the same underlying principles: moving charges experience a force perpendicular
to their motion and the magnetic field present. A motor is more or less a generator run in reverse,
and vice versa.
Figure 8.14a-c illustrates the basic operation of an electrical generator, which is nothing more
than a device to convert mechanical energy into electrical energy. A loop of wire is rotated at
constant angular velocity inside a permanent magnet. As the loop rotates, the area it exposes to
the magnetic field changes, Fig. 8.14b. Remember magnetic flux is the product of the area of the loop
and the perpendicular component of the magnetic field through it. In this case, the field does not
change, but the area exposed to the field does. When the loop is lying parallel to the magnetic
field, there is no flux, and when the loop is perfectly flush with the magnet pole faces the flux is
maximal.
Figure 8.14: Basis of an electric generator. (a) A loop of wire is rotated inside a permanent magnet. The mechanical input to rotate the loop of wire is converted into electrical energy through induction – voltage is induced in the loop as it rotates. Mechanical input can be supplied by, e.g., steam or falling water. (b) As the loop rotates, the area of the loop perpendicular to the magnetic field, A cos θ, changes. Since magnetic flux is the product of the field and the area perpendicular to it, the flux varies from a maximum when the loop is vertical to a minimum of zero when the loop is horizontal as the loop is rotated. (c) This results in an induced voltage which varies sinusoidally with time when the loop is rotated at constant angular velocity. One complete revolution of the loop corresponds to one complete cycle of voltage and current.
The rotation of the loop then creates a time-changing magnetic flux through the loop, which
varies from maximum to zero and back to maximum. This results in an induced voltage which
varies sinusoidally when the loop is rotated at constant angular velocity. One complete revolution
of the loop corresponds to one complete cycle of voltage and current, as shown in Fig. 8.14c. Since
the current and voltage in the loop vary in time, we call this “alternating current,” which we will
cover in slightly more depth in the next chapter.
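A small numerical sketch of the rotating loop is given below. It takes the flux to be ΦB = BA cos ωt and compares the closed-form voltage N BAω sin ωt (the slope of the cosine, a step this section does not derive) against a direct estimate of −N ∆ΦB /∆t over a very small time step. The field, loop area, turn count, and rotation rate are assumed values.

```python
import math


def loop_flux(t, b_field, area, omega):
    """Flux through one turn of the rotating loop, Phi = B * A * cos(w t)."""
    return b_field * area * math.cos(omega * t)


def generator_voltage(t, n_turns, b_field, area, omega):
    """Closed-form induced voltage, N * B * A * w * sin(w t)."""
    return n_turns * b_field * area * omega * math.sin(omega * t)


def generator_voltage_numeric(t, n_turns, b_field, area, omega, dt=1e-7):
    """Same quantity from Faraday's law with a small finite time step."""
    d_flux = loop_flux(t + dt, b_field, area, omega) - loop_flux(t - dt, b_field, area, omega)
    return -n_turns * d_flux / (2 * dt)


# Assumed generator parameters, for illustration only.
N = 50
B = 0.20                     # T
A = 0.010                    # m^2
OMEGA = 2 * math.pi * 60.0   # rad/s (60 revolutions per second)

period = 2 * math.pi / OMEGA
peak = N * B * A * OMEGA

for fraction in (0.0, 0.25, 0.5, 0.75):
    t = fraction * period
    print(f"t = {fraction:.2f} T: V = {generator_voltage(t, N, B, A, OMEGA):+.2f} V")
```

The voltage passes through zero when the flux is at an extreme and peaks, at N BAω, a quarter turn later. Spinning the loop faster raises both the frequency and the peak voltage.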
1. A magnetic field of 0.3 T is directed perpendicular to the plane of a circular loop of wire of
radius 25 cm. Find the magnetic flux through the area enclosed by this loop.
2.3 × 10−2 T·m2
7.1 × 10−3 T·m2
4.8 × 10−1 T·m2
5.9 × 10−2 T·m2
2. A magnet and a non-magnet of the same mass are dropped into copper tubes of equal length.
Which takes longer to come out?
The magnet.
The non-magnet.
It takes the same amount of time.
3. A flat metal plate swings at the end of a bar as a pendulum, as shown. When the pendulum is at
position a, what are the directions of the induced currents and (magnetic) force on the bar, respectively?
[Figure: the pendulum plate shown at positions a and b, moving with velocity v at the edges of a region of magnetic field Bin directed into the page.]
Counterclockwise; to the left
Clockwise; to the left
Counterclockwise; to the right
Clockwise; to the right
5. The magnetic flux through a loop can change due to a change in:
The area of the coil
Counterclockwise; to the left
Clockwise; to the left
Counterclockwise; to the right
Clockwise; to the right
8.9 Problems
Is it possible to have a magnet strong enough (or a tube conductive enough, etc) that it would
actually stop inside the tube? Explain.
6. A circular coil enclosing an area of 105 cm2 is made of 200 turns of copper wire. The wire
making up the coil has resistance of 7.0 Ω, and the ends of the wire are connected to form a
closed circuit. Initially, a 2.0 T uniform magnetic field points perpendicularly upward through
the plane of the coil. The direction of the field then reverses so that the final magnetic field has
a magnitude of 2.0 T and points downward through the coil. If the time required for the field
to reverse directions is 0.15 s, what average current flows through the coil during that time?
7. An aluminum ring of radius 5.0 cm and resistance 1.0×10−4 Ω is placed around the top of a
long air-core solenoid with n = 996 turns per meter and a smaller radius of 3.0 cm. If the current
in the solenoid is increasing at a constant rate of 266 A/s, what is the induced current in the ring?
Assume that the magnetic field produced by the solenoid over the area at the end of the solenoid
is one-half as strong as the field at the center of the solenoid. Assume also that the solenoid
produces a negligible field outside its cross-sectional area.
1. 5.9 × 10−2 T·m2 . The area of the circular loop is πr2 , where r = 0.25 m is the radius of the
loop (note that we changed it to meters!). The flux is then just ΦB = BA - since the field is
perpendicular to the area of the loop, θ = 0 and cos θ = 1.
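As a quick check of the arithmetic, the flux can be evaluated directly. This is a minimal sketch using the numbers from the question; the function name is just an illustrative choice.

```python
import math

def flux_through_circle(B, radius):
    """Flux B*A through a circular loop with the field normal to its plane."""
    return B * math.pi * radius ** 2

# B = 0.3 T, r = 25 cm = 0.25 m
phi = flux_through_circle(0.3, 0.25)
print(f"flux = {phi:.3g} T·m^2")
```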
2. The magnet. The magnet induces eddy currents in the copper tube, which create an
opposing magnetic field that slows the magnet.
In position b, as the pendulum swings out of the region of magnetic field, the flux is decreasing.
The induced currents will try to increase the flux, and will create a field acting in the same
direction as the external field. This implies a clockwise current.
Again, the force will act in the direction opposite the velocity, to try to slow the change in flux.
4. a.
5. All of the above. Magnetic flux is ΦB = BA cos θBA , where θBA is the angle between the
loop area’s normal and the magnetic field. A change in magnetic field or area clearly changes
the flux ΦB , as does changing the orientation of the loop in the field, which changes θBA .
6. Counterclockwise; to the left. When the conducting rod moves to the right, this serves
to increase the flux as time passes (A increases while B stays constant), so any induced current
wants to stop this change and decrease the flux. Therefore, the induced current will act in such
a way to oppose the external field (i.e., the field due to the induced current will be opposite to
the external field). This must be a counterclockwise current. Consistent with decreasing the
overall flux, the force on the bar must be to the left, attempting to impede the bar’s progress
and reduce the change in flux.
1. 1.35 m/s. The presence of a magnetic field perpendicular to the movement of ions in the
blood means that they experience a magnetic force.
Fm = qvB
The force for positive ions is pointing up, and for negative ions it is pointing down. This serves
to separate spatially the positive and negative charges - positive charges move toward electrode
A, and negative charges move to point B. This continues until the ions reach the surface of the
artery, at which point they are separated by the diameter of the artery d.
If the charges are separated spatially by a distance d, then this gives rise to an electric field E,
and a potential difference ∆V = Ed. At this point, the electric and magnetic forces are balanced.
Set the magnetic and electric forces equal to one another, use the expression for ∆V, and take
care with units:
\[ F_m = qvB = F_e = qE = q\,\frac{\Delta V}{d} \]
\[ \implies v = \frac{q\,\Delta V}{qBd} = \frac{\Delta V}{Bd} = 1.35\ \text{m/s} \]
You can verify that the sign of the potential difference does not depend on whether the ions
are mostly positive or negative, since the force is in different directions for each ion. No matter
what, positive ions go to point A, and negative ions go to point B, and the same potential
difference results.
2. 2.25 A, 0.51 N, to the left. When the axle moves to the left, this serves to decrease the
flux as time passes, so any induced current wants to stop this change and increase the flux.
Therefore, the induced current will act in such a way to reinforce the external field (i.e., the
field due to the induced current will be in the same direction as the external field). This must
be a clockwise current.
To find the current, we only need to use motional emf. The axle is just a bar of length l moving
at velocity v in a magnetic field B. This gives us a voltage ∆V , and Ohm’s law gives us I:
\[ \mathcal{E} = Blv \]
\[ I = \frac{\Delta V}{R} = 2.25\ \text{A} \]
The magnitude of the force is now found readily - the axle is just a wire carrying a current I in
a magnetic field B:
\[ |\vec{F}| = BIl = 0.51\ \text{N} \]
What is the direction? For one, if the axle is traveling at constant velocity, then the external
force must balance the magnetic force to give zero net force. The magnetic force must be
pointing to the right (using the right hand rule), so the external force must be pointing to the left.
Alternatively, we note that the external force is what is pushing the axle in the first place! So
it has to be in the same direction as $\vec{v}$, namely, to the left.
3. 0.5 A, 2.0 W. The first part is exactly like the previous problem.
E = Blv = IR
The problem is now that we don’t know B. We do know that the external and magnetic forces
must balance for the rod to have a constant velocity.
\[ F_m = BIl = F_{app} \implies B = \frac{F_{app}}{Il} \]
Plug that into the first equation:
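The substitution itself can be written out as follows. This is a reconstruction from the two relations above; the numerical value is the one quoted at the start of this solution, since the given F, v, and R are not repeated here.

```latex
\[
\mathcal{E} = Blv = \frac{F_{app}}{Il}\,l\,v = \frac{F_{app}\,v}{I} = IR
\quad\Longrightarrow\quad
I = \sqrt{\frac{F_{app}\,v}{R}} = 0.5\ \text{A}
\]
```

The dissipated power then follows from P = I²R, which equals F_app v by the same relation.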
Alternatively, you note that power delivered by a force is PF = F v cos θ, where θ is the angle
between the force and velocity. In this case, θ = 0, so PF = F v = 2 W.
Super sneaky way to do everything at once: recall in the first place that the power
supplied by the force must equal the power dissipated in the resistor: PF = I²R = Fv. You
know F, and v, so you can calculate the power, and you also know R, so just solve for I.
4. No. The eddy current braking comes from induced currents in the copper tube due to the
falling magnet. The falling magnet represents a time-varying B field, which creates a time-
varying flux through the copper tube. If the magnet actually stopped, there would be no eddy
currents at all, and nothing to hold the magnet against gravity.
Once the magnet stops, the very force slowing it down ceases to exist. The flux in the tube is
changing only because the magnet has some non-zero velocity. No emf, and therefore no eddy
currents result from a stationary magnet giving a constant flux through the tube.
Putting it another way: the force is due to the relative velocity of the magnet and the charges
in the copper. The magnetic force is F = qvB, where v is the relative velocity of the tube and
magnet. If v = 0, there is no force - so if the magnet could actually be stopped, the force holding
it up would go to zero, and it would fall again! Clearly, the answer is no.
5. Q = πCa2 K, upper
6. First, we know that the changing magnetic field through the coil will give an induced voltage
via Faraday’s law. From the induced voltage, and the given resistance, we can find the current
using Ohm’s law. First, since the coil geometry is fixed, the voltage induced in a coil of N turns
of wire of area A due to a changing B field is just:
\[ \Delta V = -N\,\frac{\Delta \Phi_B}{\Delta t} = -NA\,\frac{\Delta B}{\Delta t} \]
So the first question is: what is ∆B/∆t? The field changes from +2 to −2 T in 0.15 s, so we just
calculate it. You did take care that the field changes sign, right?
\[ \frac{\Delta B}{\Delta t} = \frac{2\ \text{T} - (-2\ \text{T})}{0.15\ \text{s}} = \frac{4}{0.15}\ \text{T/s} \]
Now that we have that, we note that Ohm’s law gives us the current from the induced voltage
and resistance. You of course remembered to change the area to square meters, not centimeters.
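\[
\begin{aligned}
I = \frac{\Delta V}{R} &= -\frac{NA}{R}\,\frac{\Delta B}{\Delta t} \\
&= -\frac{200 \cdot 0.0105\ \text{m}^2}{7\ \Omega}\cdot\frac{4\ \text{T}}{0.15\ \text{s}} \\
&= -8\ \text{A}
\end{aligned}
\]
The minus sign only indicates that the induced current opposes the change in flux; the magnitude of the average current is 8 A.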
If you used all SI units, your answer is in amps. In order to verify this, you need to ‘recall’ that
1 T·m² = 1 V·s = 1 Wb, and 1 Ω = 1 V/A.
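The same calculation can be checked in a few lines. This is a minimal sketch with the numbers from the problem; the variable names are illustrative, and the change in field is taken here as final minus initial, so only the magnitude of the result should be compared with the answer above.

```python
N = 200                 # turns
A = 105e-4              # 105 cm^2 converted to m^2
R = 7.0                 # ohms
B_initial = 2.0         # T, upward
B_final = -2.0          # T, downward
dt = 0.15               # s

# Faraday's law for a fixed coil: emf = -N * A * dB/dt
emf = -N * A * (B_final - B_initial) / dt
current = emf / R       # Ohm's law

print(f"|emf| = {abs(emf):.1f} V, |I| = {abs(current):.1f} A")
```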
7. Clearly, this is a magnetic induction problem. We know that a current through the solenoid
produces a magnetic field, therefore a time-varying current creates a time-varying magnetic
field. This time-varying magnetic field is felt by the aluminum ring, and by Faraday’s law, a
voltage is induced. First, we can write down Faraday’s law for the voltage induced around the
aluminum ring due to the time-varying field of the solenoid, noting that the ring only picks up
one half of the solenoid’s field:
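\[
\begin{aligned}
\Delta V_{ring} &= N_{ring}\,\frac{\Delta \Phi_{ring}}{\Delta t} \\
&= \frac{\Delta\left(\tfrac{1}{2} B_{sol} \cdot A_{ring}\right)}{\Delta t} \\
&= \frac{1}{2}\,A_{ring}\,\frac{\Delta B_{sol}}{\Delta t}
\end{aligned}
\]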
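Here we made use of the fact that the area of the aluminum ring is a constant, and being a solid
ring, it has only one turn (Nring = 1). Since the solenoid produces a negligible field outside its
cross section, the area that actually carries flux is that of the solenoid (radius 3.0 cm), and that is
the value used for Aring below. How do we find ∆Bsol/∆t? We know the rate at which the current
through the solenoid changes, ∆Isol/∆t, so all we need is the relation between current and B field
for a solenoid: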
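\[
\begin{aligned}
B_{sol} &= \mu_0 n I \\
\implies \frac{\Delta B_{sol}}{\Delta t} &= \frac{\Delta\left(\mu_0 n_{sol} I\right)}{\Delta t} \\
\frac{\Delta B_{sol}}{\Delta t} &= \mu_0 n_{sol}\,\frac{\Delta I_{sol}}{\Delta t}
\end{aligned}
\]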
Here nsol is the number of turns per unit length of the solenoid. Now we can plug this into our
expression for ∆Vring , and we are nearly done:
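\[
\begin{aligned}
\Delta V_{ring} &= \frac{1}{2}\,A_{ring}\,\frac{\Delta B_{sol}}{\Delta t} \\
&= \frac{1}{2}\,A_{ring}\,\mu_0 n_{sol}\,\frac{\Delta I_{sol}}{\Delta t}
\end{aligned}
\]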
The only thing left is to relate the current in the ring to the induced voltage, via Ohm’s law,
and plug in the numbers.
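\[
\begin{aligned}
I_{ring} &= \frac{\Delta V_{ring}}{R_{ring}} \\
&= \frac{\mu_0 A_{ring} n_{sol}}{2R_{ring}}\,\frac{\Delta I_{sol}}{\Delta t} \\
&= \frac{\left(4\pi \times 10^{-7}\ \text{T·m/A}\right) \cdot \pi\,(0.03\ \text{m})^2 \cdot 996\ \text{m}^{-1}}{2 \cdot 1.0\times 10^{-4}\ \Omega}\,(266\ \text{A/s}) \\
&= 4.7\ \frac{\text{T·m·m}^2\text{·m}^{-1}\text{·A}}{\Omega\text{·s·A}} \\
&= 4.7\ \frac{\text{T·m}^2}{\text{V·s}}\cdot\text{A} \\
I_{ring} &= 4.7\ \text{A}
\end{aligned}
\]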
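The numbers can be verified with a short script. This is a sketch under the same assumptions as the solution (half of the central field at the end of the solenoid, flux only through the solenoid's 3.0 cm cross section); the variable names are illustrative.

```python
import math

MU0 = 4 * math.pi * 1e-7    # T·m/A
n_sol = 996                 # turns per meter
r_sol = 0.03                # m, radius of the region that carries flux
R_ring = 1.0e-4             # ohms
dI_dt = 266.0               # A/s

area = math.pi * r_sol ** 2
# emf = (1/2) * A * mu0 * n * dI/dt, with the factor 1/2 for the end of the solenoid
emf_ring = 0.5 * area * MU0 * n_sol * dI_dt
I_ring = emf_ring / R_ring

print(f"emf = {emf_ring:.3e} V, I_ring = {I_ring:.2f} A")
```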
Alternating current (ac) is nothing more than current that varies (sinusoidally) in time,
and in Section 8.6 we learned how to produce alternating current with a simple rotating loop
generator. As it turns out, nearly all appliances around us run on alternating current - the “wall
current” you get from an outlet is alternating current at a frequency of 60 Hz. Not only is ac current
important for everyday life, even simple circuits behave differently when powered by time-varying
currents and voltages.
In this chapter, we will very briefly discuss ac circuits, and move on to electromagnetic waves,
which will lead the way to optics and modern physics.
here ∆V (t) is the voltage at any instant in time t, ∆Vmax is the peak voltage, ω is the angular
frequency, and f the frequency in Hz. The magnitude of an ac not only varies with time, it actually
changes sign as well.
Circuit diagram symbol for an ac voltage source: ∼
What happens when we connect components to an ac source? As the simplest example, we will
just hook up a single resistor to an ac voltage source, as shown in Fig. 9.1a. We know the voltage
across the resistor varies according to Eq. 9.1. Just because the voltage is changing doesn’t mean
that Ohm’s law is not valid, however, so we can immediately find the current through the resistor
as a function of time as well:
\[ I_R(t) = \frac{\Delta V(t)}{R} = \frac{\Delta V_{max}}{R}\,\sin 2\pi f t \qquad (9.2) \]
Well, big deal. The voltage goes up and down, and so does the current. This should not be a
surprise. The power in the resistor is more interesting though. We can readily calculate that from
Eq. 5.27:
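\[
\begin{aligned}
P_R(t) = I\,\Delta V &= \frac{\Delta V_{max}}{R}\,\sin 2\pi f t \cdot \Delta V_{max}\,\sin 2\pi f t \qquad (9.3) \\
&= \frac{\Delta V_{max}^2}{R}\,\sin^2 2\pi f t \qquad (9.4)
\end{aligned}
\]
[Figure 9.1 (Section 9.1, Resistors in an ac Circuit): (a) a resistor R connected to an ac voltage source, V ∝ V0 sin ωt; plots of V and I.]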
The power dissipated in the resistor also varies with time. Moreover, its period is only half that
of the current and voltage. Since the power dissipation in the resistor depends on the product of
voltage and current (or the square of voltage or current), it doesn’t matter if the voltage and current
are negative: their product is never negative.
Even more interestingly, the power dissipated is actually zero whenever the current and voltage
go through zero. While the average voltage or current over any integer number of periods is zero,
the average power is not. Further, the dissipation produced by a sinusoidal voltage is not the same
as just applying a constant dc voltage of ∆Vmax , since the alternating voltage is only at its maximum
value for an instant. In ac circuits, it is common to use a special kind of average, the root mean
square or rms.
The rms average of a collection of n numbers x1 , x2 , · · · xn is defined like this:
\[ x_{rms} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2} = \sqrt{\frac{x_1^2 + x_2^2 + \cdots + x_n^2}{n}} \qquad (9.5) \]
Basically, the rms average takes the average of the squares of the numbers, and then takes the
square root of that. The rms average is useful when dealing with periodic functions, since it does
not average to zero over a full cycle, but gives a sort of averaged amplitude independent of whether
the function changes sign. We can find the rms value of the current, voltage, and power with a bit
of algebra, but it is tedious. We will merely quote the results:
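\[ V_{R,rms} = \frac{\Delta V_{max}}{\sqrt{2}} \approx 0.707\,\Delta V_{max} \qquad (9.6) \]
\[ I_{R,rms} = \frac{I_{max}}{\sqrt{2}} \approx 0.707\,I_{max} \qquad (9.7) \]
\[ P_{R,av} = I_{rms}^2 R \qquad (9.8) \]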
The average power is just calculated from the rms current or voltage as you would expect. To be
concrete: this means that an alternating current with a peak value of 5 A produces the same dissipation
in a resistor as a dc current of 5/√2 ≈ 3.5 A, about 30% less. The rms voltage, current, and resistance obey Ohm’s
law just as the maximum values do: ∆Vrms = Irms R, just as ∆Vmax = Imax R.
With these relationships, we can also relate the power dissipated to the maximum current, just
for completeness:
\[ P_{av} = I_{rms}^2 R = \frac{1}{2}\,I_{max}^2 R \qquad (9.11) \]
When you plug an electrical device into the wall, you are connecting it to an ac voltage source.
Normal power in the US uses an rms voltage of 120 V, which means that the actual peak voltage at
the wall outlet is 120·√2 V, or about 170 V. It is typically rms values of current and voltage that
are quoted for ac circuits, and for the remainder of the chapter, that is what we will quote. One
can easily convert between rms and maximum values if desired – it is just a factor √2 in the end.
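The factor √2 can be checked numerically by applying the definition in Eq. 9.5 to samples of a sine wave. The sketch below assumes a 120 V rms wall voltage sampled evenly over one full cycle; the sample count is an arbitrary choice.

```python
import math

def rms(values):
    """Root mean square: square root of the mean of the squares (Eq. 9.5)."""
    return math.sqrt(sum(v * v for v in values) / len(values))

n = 10000                                  # samples over one full cycle
v_peak = 120 * math.sqrt(2)                # peak of a 120 V rms sine wave
samples = [v_peak * math.sin(2 * math.pi * k / n) for k in range(n)]

v_mean = sum(samples) / n                  # plain average over a cycle
v_rms = rms(samples)                       # rms average over a cycle

print(f"peak = {v_peak:.1f} V, mean = {v_mean:.2e} V, rms = {v_rms:.1f} V")
```

The plain average comes out as zero to within rounding, while the rms value recovers the 120 V that is quoted for the outlet.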
a short time. A capacitor therefore restricts current flow to very short time intervals, depending
on its time constant τ = RC.
If we connect a single capacitor to an ac voltage source, Fig. 9.2, what will happen? At t = 0
on the graph, the voltage (blue curve) starts from zero and quickly increases. Ramping up the
voltage on the capacitor means that a large current will flow (black curve), attempting to charge
the capacitor. The faster the voltage increases – the larger the slope of the V (t) curve – the larger
the current will be. When V (t) reaches its plateau one quarter of the way through the cycle, the
voltage is nearly constant, and no current flows through the capacitor. Shortly thereafter, the
voltage decreases, and the capacitor responds by discharging, again at a rate proportional to the
slope of the V (t) curve. Once the voltage changes sign, the capacitor begins charging up again with
the opposite polarity, and the whole cycle repeats itself. What is important to realize is that in
ac circuits, current does flow through capacitors – it is just like the RC circuits we studied earlier,
except that now we are effectively turning the voltage on and off continuously.
[Figure 9.2: (a) a capacitor C connected to an ac voltage source, V ∝ V0 sin ωt; plots of voltage, current, and power.]
The current on the capacitor reaches its maximum positive and negative values whenever the
voltage is zero. Similarly, the current goes to zero whenever the voltage is at a maximum, as at
those points the voltage is momentarily essentially constant. In the end, this leads to the current
through the capacitor also being sinusoidal, but with a quarter cycle (90°) phase shift. The usual
way of stating this is that the “voltage lags the current by 90°,” a reference to the fact that the
current reaches its maximum a quarter cycle before the voltage does. More mathematically, the
current response has a +90° phase shift with respect to the driving voltage.
How much current flows through the capacitor? We can qualitatively figure out what it depends
on already. As the voltage varies, the capacitor only allows the most current to pass when the
voltage is changing the most rapidly. We expect, then, that the current goes up as the frequency
of the voltage goes up and the voltage changes faster and faster. As the capacitance gets larger,
more and more charge is required, so we should also expect larger current for a larger capacitor.
Making this quantitative involves generalizing Ohm’s law to ac circuits. For resistive elements,
this is not necessary; Ohm’s law works just fine. What we need is a way to relate current and
voltage for reactive elements, like capacitors and inductors, that react to changes in current and
voltage. Instead of resistance, reactive elements have what is called a reactance X, which relates
the rms voltage and current in the same way resistance does in Ohm’s law:
\[ \Delta V_{rms} = I_{rms}\,X \]
where X is the reactance of the circuit element. Capacitors and inductors are reactive
elements; resistors are not.
For capacitors, the reactance has just about the form we would expect: inversely proportional
to frequency and capacitance:
\[ X_C = \frac{1}{2\pi f C} \qquad (9.13) \]
As the frequency of the voltage increases, the reactance decreases, and the current increases.
Similarly, as capacitance increases, the current increases:
\[ I_{rms} = \frac{\Delta V_{rms}}{X_C} = 2\pi f C\,\Delta V_{rms} \qquad (9.14) \]
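To get a feel for Eqs. 9.13 and 9.14, the reactance and rms current can be tabulated for a few frequencies. This is a minimal sketch; the 10 µF capacitor and 120 V rms source are hypothetical values chosen only for illustration.

```python
import math

def capacitive_reactance(f, C):
    """Reactance of a capacitor, X_C = 1 / (2*pi*f*C), in ohms (Eq. 9.13)."""
    return 1.0 / (2 * math.pi * f * C)

def capacitor_rms_current(v_rms, f, C):
    """rms current through the capacitor, I_rms = V_rms / X_C (Eq. 9.14)."""
    return v_rms / capacitive_reactance(f, C)

C = 10e-6        # hypothetical 10 microfarad capacitor
v_rms = 120.0    # hypothetical rms source voltage

for f in (60.0, 600.0, 6000.0):
    print(f"f = {f:6.0f} Hz   X_C = {capacitive_reactance(f, C):8.2f} ohm   "
          f"I_rms = {capacitor_rms_current(v_rms, f, C):6.3f} A")
```

Each tenfold increase in frequency lowers the reactance by a factor of ten and raises the current by the same factor.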
What about the power in a capacitive ac circuit? Figure 9.2 shows the current, voltage, and
power for a capacitor connected to an ac voltage source. Since the voltage and current are now 90◦
out of phase, the maximum power now occurs halfway between the maximum current and voltage.
Further, now the power can become negative. What does that mean? Simple: during the charging
cycle, the power is positive as it is for a resistor, meaning that the source is supplying energy to the
capacitor. During the discharging cycle, the capacitor is pushing charges back to the source, and
effectively, the source is draining energy from the capacitor. Charge gets pushed back and forth
between the source and capacitor, and the power swings from positive to negative. We get nothing
for free, however – during the discharge cycle, the capacitor is just pushing back the charges it
stored during the charging cycle, so energy is still conserved. In this light, a capacitor is useful as
either a temporary energy storage device, or as a way of generating a time-delayed response.
\[ X_L = 2\pi f L \qquad (9.15) \]
Now consider an inductor L connected to an ac voltage source, Fig. 9.3. It is a bit easier to
begin describing the inductor’s behavior starting at one quarter cycle, when the voltage is at its
maximum. At this point, the voltage is momentarily constant, and so is the current in the inductor,
so it offers no resistance. As the voltage begins to decrease, its time variation (slope) increases,
and the inductor offers more and more resistance to current flow. The voltage across the inductor
is opposite that of the source, and it tries to push current back into the source. When the change in
voltage with time is maximum, when the voltage crosses zero, the inductor is pushing a maximum
current back to the source.
[Figure 9.3: (a) an inductor L connected to an ac voltage source, V ∝ V0 sin ωt; (b) plots of voltage, current, and power.]
As the source voltage becomes negative and its variation slows, the inductor current decreases,
and when the voltage reaches its minimum, the inductor current is zero. Just like in a capacitor,
the maximum current and voltage are one quarter cycle apart, but now the current lags
behind the voltage, and we say that the “voltage across the inductor leads the current by 90°.”
Again, more mathematically, the current response has a −90◦ phase shift with respect to the driving
voltage.
The power in an inductive circuit behaves similarly to that in a capacitive circuit – for half of
the cycle, while the voltage is decreasing, the inductor is absorbing energy from the source and
storing it in its magnetic field, and for the other half, while the voltage is increasing, it is pushing
energy back to the source (Fig. 9.3b).
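The back-and-forth flow of energy can be illustrated numerically. The sketch below assumes a pure inductor with the −90° current phase shift described above, takes the peak current as Vmax/XL (the same ratio that holds for rms values), and averages the instantaneous power over one cycle. The component values are hypothetical.

```python
import math

def inductive_reactance(f, L):
    """Reactance of an inductor, X_L = 2*pi*f*L, in ohms (Eq. 9.15)."""
    return 2 * math.pi * f * L

# Hypothetical values: 10 V peak, 60 Hz source, 0.1 H inductor.
V_max, f, L = 10.0, 60.0, 0.1
X_L = inductive_reactance(f, L)
I_max = V_max / X_L
omega = 2 * math.pi * f

n = 1000                                        # samples over one cycle
times = [k / (n * f) for k in range(n)]
# Instantaneous power V(t) * I(t), with the current lagging by 90 degrees.
power = [V_max * math.sin(omega * t) * I_max * math.sin(omega * t - math.pi / 2)
         for t in times]
avg_power = sum(power) / n

print(f"X_L = {X_L:.1f} ohm, I_max = {I_max:.3f} A")
print(f"power swings between {min(power):.2f} W and {max(power):.2f} W")
print(f"average power over one cycle = {avg_power:.1e} W")
```

The power swings equally positive and negative, so the average over a full cycle vanishes to within rounding: an ideal inductor stores and returns energy but dissipates none.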
9.4 Filters
What neat things can we do with ac circuits? Already, we know enough to build simple signal
filters. Consider the circuit in Fig. 9.4, which we have drawn in a manner closer to what electrical
engineers typically use. A voltage Vin is sent in from the left, defined relative to ground. That
is, a voltage Vin is applied between a “positive” signal wire (the upper wire), and a ground wire.
A resistor R connects the signal wire to the output Vout, and a capacitor C connects the output to
ground. Basically, the resistor and capacitor are in series, and the output voltage is taken across
the capacitor. What happens in this circuit?
Figure 9.4: An RC low-pass filter. Capacitors present a low reactance to high frequency signals, so they are
selectively returned to ground before the output. [Circuit: R in series between Vin and Vout, C from Vout to ground; plot of Vout versus frequency.]
Resistors present an equal resistance to signals of any frequency, but capacitors present a lower
reactance to high frequency signals. High-frequency signals entering from the input see the capacitor
as a low reactance path to ground, and thus most of the high-frequency signal takes this path to the
ground and never reaches the output. Low-frequency signals see the capacitor as a high reactance
and avoid this path, so most of the low-frequency signal reaches the output. What this circuit
really does is selectively filter out the high-frequency portions of a mixed frequency signal, and let
the low frequency signals pass through – the ratio between the input voltage Vin and the output
Vout depends on frequency. For this reason, this circuit is known as a “low-pass” filter. A circuit
like this could be used to direct the low-frequency portions of an audio signal to a “woofer” speaker,
for instance. The frequency response of this type of filter is also shown in Fig. 9.4.
Why is the resistor there, and what is the range of filtered frequencies? This circuit can
also be thought of as a generalization of the resistive voltage divider (series resistors), where the
voltage division factor depends on frequency. When the reactance of the capacitor is equal to the
resistance, half of the input power goes through the resistor, and half through the capacitor. Since
power goes as voltage squared, when the reactance equals the resistance the output will be reduced
by a factor 1/√2 relative to the input, to about 70% of its value. The reactance and resistance will be equal at
one particular frequency, the cutoff frequency, whose value is given by:
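\[ X_C = \frac{1}{2\pi f_{cutoff}\,C} = R \qquad (9.16) \]
\[ \implies 2\pi f_{cutoff} = \frac{1}{RC} = \frac{1}{\tau} \qquad (9.17) \]
\[ \text{and}\quad \frac{V_{out}}{V_{in}} = \frac{1}{\sqrt{2}} \quad \text{at the cutoff frequency} \qquad (9.18) \]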
It should not be too surprising at this point that the cutoff frequency (expressed as an angular
frequency, 2πfcutoff) is just the same as one over the time constant. After all, this is precisely the
same RC circuit we studied earlier, just viewed in the frequency domain instead of the time domain!
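The frequency dependence of the divider can be sketched numerically. The code below assumes the standard series-divider result |Vout/Vin| = XC/√(R² + XC²), which is not derived in the text but reduces to 1/√2 when XC = R, in agreement with Eq. 9.18. The 1 kΩ and 1 µF component values are hypothetical.

```python
import math

def rc_cutoff(R, C):
    """Cutoff frequency in Hz from Eq. 9.17: 2*pi*f_cutoff = 1/(R*C)."""
    return 1.0 / (2 * math.pi * R * C)

def rc_lowpass_gain(f, R, C):
    """|Vout/Vin| with the output taken across the capacitor (assumed formula)."""
    x_c = 1.0 / (2 * math.pi * f * C)
    return x_c / math.hypot(R, x_c)

R, C = 1.0e3, 1.0e-6          # hypothetical 1 kilo-ohm and 1 microfarad
f_c = rc_cutoff(R, C)
print(f"cutoff frequency = {f_c:.1f} Hz")

for f in (0.01 * f_c, 0.1 * f_c, f_c, 10 * f_c, 100 * f_c):
    print(f"f = {f:10.1f} Hz   Vout/Vin = {rc_lowpass_gain(f, R, C):.3f}")
```

Well below the cutoff the ratio is close to 1, and well above it the output falls off roughly in proportion to 1/f.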
What about the circuit in Fig. 9.5? In this case, an inductor and resistor are in series. The
inductor presents a low reactance to low-frequency signals, and they will be preferentially sent to
ground before reaching the output. High-frequency signals will avoid the inductor, and pass easily
to the output. Thus, this circuit is a “high-pass” filter, selectively filtering out the low-frequency
portions of a mixed frequency signal, and letting high-frequency portions pass through. A high pass
filter like this one could be used to send high frequency audio signals to a “tweeter” speaker, and
block lower frequency bass signals that may damage it. More complicated (but still recognizable)
high-pass and low-pass filters are used in audio equipment in exactly this way.
Figure 9.5: An RL high-pass filter. Inductors present a low reactance to low frequency signals, so they are
selectively returned to ground before the output. [Circuit: R in series between Vin and Vout, L from Vout to ground; plot of Vout versus frequency.]
The cutoff frequency of the RL filter can be determined just like we did above for the RC filter
– when the reactance of the inductor equals the resistance, half the power goes to each component.
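\[ X_L = 2\pi f_{cutoff}\,L = R \qquad (9.19) \]
\[ \implies 2\pi f_{cutoff} = \frac{R}{L} = \frac{1}{\tau} \qquad (9.20) \]
\[ \frac{V_{out}}{V_{in}} = \frac{1}{\sqrt{2}} \quad \text{at the cutoff frequency} \qquad (9.21) \]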
The frequency response of this filter is shown in Fig. 9.5. Once again, the cutoff frequency is
just the inverse of the time constant, since the frequency- and time-domain descriptions are inverse
points of view.
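The high-pass behavior can be sketched the same way. As with the RC case, the divider formula |Vout/Vin| = XL/√(R² + XL²) is a standard result assumed here rather than taken from the text; it gives 1/√2 when XL = R, matching Eq. 9.21. The 1 kΩ and 10 mH values are hypothetical.

```python
import math

def rl_cutoff(R, L):
    """Cutoff frequency in Hz from Eq. 9.20: 2*pi*f_cutoff = R/L."""
    return R / (2 * math.pi * L)

def rl_highpass_gain(f, R, L):
    """|Vout/Vin| with the output taken across the inductor (assumed formula)."""
    x_l = 2 * math.pi * f * L
    return x_l / math.hypot(R, x_l)

R, L = 1.0e3, 10.0e-3         # hypothetical 1 kilo-ohm and 10 millihenry
f_c = rl_cutoff(R, L)
print(f"cutoff frequency = {f_c:.0f} Hz")

for f in (0.01 * f_c, 0.1 * f_c, f_c, 10 * f_c, 100 * f_c):
    print(f"f = {f:12.1f} Hz   Vout/Vin = {rl_highpass_gain(f, R, L):.3f}")
```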
There is much, much more we can do with ac circuits; we have only just scratched the surface.
Now it is time to move on once again, and work our way away from electricity and magnetism
toward optics. Of course, optics is also electricity and magnetism, as we shall see!
Since electric and magnetic fields in a volume of space represent energy, whenever a charged particle
accelerates, it radiates energy.
In order to get an idea of how electromagnetic waves work, we will consider a simple antenna
connected to an alternating voltage source, Fig. 9.6. The alternating (sinusoidal) voltage source
applied to the two antenna wires causes electric charges in the wires to oscillate (this is basically
how a broadcast antenna works). For the sake of argument, the alternating voltage source has a
period T , and a frequency of f = 1/T .
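[Figure 9.6: antenna rods connected to an alternating voltage source; panels (a)-(d) show the charge on the rods at successive times in the cycle.]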
At time t = 0, Fig. 9.6a, the upper rod is at a maximum positive voltage, and the lower a
maximum negative voltage. Thus the upper rod is given a maximal positive charge, and the lower
rod a maximal negative charge. The electric field at this instant is pointing downward. As the
voltage source oscillates, the voltage and amount of charge on each rod decreases, reaching zero at
one quarter of the source’s period T (t = T /4) as shown in Fig. 9.6b. At this point, E = 0 at the
antenna. The maximum E field created T /4 seconds earlier, however, has not disappeared! It has
travelled at a velocity c for T/4 seconds, so it is cT/4 meters away from the antenna. Remember,
a time-varying E field travels from the antenna at the speed of light c.
At a still later time T /2, Fig. 9.6c, the voltage source has completed one half cycle, and has
reversed polarity. Now the E field at the antenna is reversed in direction, and again at a maximal
value. This continues on, Fig. 9.6d, and the E field at the antenna oscillates in phase with the
induced charge distribution as we would expect. At any instant, the E field at the antenna depends
on the charge on the rods at that instant, and therefore the voltage applied by the source.
Basically, we have set up a charge distribution which oscillates in time, just like a mass on a
spring oscillates. Since the motion is oscillatory, we know that the charges are accelerating, and
therefore, radiating energy. One part of the radiation is just the E field traveling out from the
source at a velocity c.
While the charges oscillate, they also constitute a time-varying current in the rods. The current
is maximal when the voltage and E field are maximal, since that is the moment when the most
charge is moving in the smallest amount of time. Likewise, when E = 0, the current is zero. The
presence of a current means there is also a magnetic field created, as shown in Fig. 9.7.
Figure 9.7: Magnetic fields around an antenna carrying an alternating current for two different times in the
current cycle: (a) t = T/2, and (b) t = 0, T. The current is in phase with the voltage source. Note that this
is true whether the antenna is producing the radiation or receiving it!
The magnetic field oscillates in time, in phase with the current and the E field. By the right
hand rule, $\vec{B}$ is always perpendicular to $\vec{E}$. This is the other part of the radiation of the accelerating
charges, the B field traveling out from the source at a velocity c.
The basic result of this is that changing magnetic fields produce an electric field, and changing
electric fields produce a magnetic field. These induced electric and magnetic fields are always in
phase (they reach maximum and minimum values at the same point), and the fields are at right
angles.
Electromagnetic waves travel with the speed of light, which in fact relates the permittivity ε
of a medium (1/ε relates to the strength of E) and the permeability µ of a medium (µ relates to the
strength of B). In vacuum:
\[ c = \frac{1}{\sqrt{\mu_0 \epsilon_0}} = 2.99792 \times 10^{8}\ \text{m/s} \qquad (9.22) \]
Figure 9.8: An electromagnetic wave at one instant of time, moving in the positive x-direction with speed c. The
electric field points along the y axis, and is perpendicular to the magnetic field at every point. Both $\vec{E}$ and $\vec{B}$ are
perpendicular to the direction of wave propagation.
Relationship between $|\vec{E}|$ and $|\vec{B}|$ in an EM wave:
\[ c = \frac{|\vec{E}|}{|\vec{B}|} \qquad (9.23) \]
Just like in the mass spectrometer (Sect. 7.3.1.1), perpendicular $\vec{E}$ and $\vec{B}$ fields imply a particular
velocity, and given that EM waves travel at c, this implies a fixed ratio $|\vec{E}|/|\vec{B}|$.
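Both relations are easy to check numerically. The sketch below evaluates Eq. 9.22 from the vacuum constants and then uses Eq. 9.23 to find the magnetic field amplitude that accompanies a given electric field amplitude; the value of ε0 is the standard tabulated one, and the 100 V/m field is a hypothetical example.

```python
import math

MU0 = 4 * math.pi * 1e-7       # permeability of free space, T·m/A
EPS0 = 8.854187817e-12         # permittivity of free space, C^2/(N·m^2)

# Eq. 9.22: speed of light from the two constants
c = 1.0 / math.sqrt(MU0 * EPS0)
print(f"c = {c:.5e} m/s")

def b_from_e(E):
    """Magnetic field amplitude paired with E in an EM wave (Eq. 9.23)."""
    return E / c

E_max = 100.0                  # hypothetical field amplitude, V/m
print(f"E_max = {E_max} V/m  ->  B_max = {b_from_e(E_max):.3e} T")
```

Even a fairly strong electric field is accompanied by a very small magnetic field, because the ratio of the two is fixed at c.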
Accelerating charges radiate EM waves, which really means they radiate energy. How much energy?
For a given EM wave, we can define an intensity of radiation I which is the amount of energy
absorbed per unit time, per unit (surface) area. Since the magnitudes of E and B ($|\vec{E}|$ and $|\vec{B}|$) vary
in time, and clearly the amount of radiation absorbed depends on how big the surface area is, this
is the best we can do. Of course, energy per unit time is just power, so really I is power per unit
area [W/m²].
Intensity of EM radiation
Here Emax is the maximum of the E field in the wave (the amplitude), and Bmax is the
amplitude for B. The units of I are then Watts per square meter [W/m2 ].
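For reference, an intensity expression in terms of these amplitudes that is consistent with Eq. 9.23 and with the radiation pressure formulas of Eq. 9.28 (where P = I/c for complete absorption) can be written as below. This is a reconstruction from those equations, offered as a sketch rather than a quotation.

```latex
\[
I = \frac{E_{max} B_{max}}{2\mu_0} = \frac{E_{max}^2}{2\mu_0 c} = \frac{c\,B_{max}^2}{2\mu_0}
\]
```

The three forms are equivalent once Emax = cBmax is substituted.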
U = (energy per unit time per unit area) · (area) · (time) = I · A · ∆t (9.25)
So the larger the area, and the longer the time of exposure, the more energy that is transmitted
by radiation. This much we already know first-hand during the Alabama summer.
If energy is transferred, then a momentum p must also be transferred. Though this may not
seem intuitive, incident EM radiation (light) imparts momentum on anything it strikes and transfers
energy to. Clearly this is a small effect, since we do not notice it being harder to walk toward or
away from the sun! If we take a perfectly black surface, which absorbs all incident energy, it turns
out the momentum transfer is:
\[ |\vec{p}| = \frac{U}{c} = \frac{I \cdot A \cdot \Delta t}{c} \qquad \text{complete absorption} \qquad (9.26) \]
This is the analogy of a perfectly inelastic collision (like one mass striking another and sticking
to it). If all radiation is reflected, the analogy of a perfectly elastic collision, then the momentum
transfer is twice as big. Rather than the incident EM wave simply being absorbed, which changes
its velocity from c to zero, now it is reversing direction, which changes its velocity from +c to −c.
\[ |\vec{p}| = \frac{2U}{c} = \frac{2I \cdot A \cdot \Delta t}{c} \qquad \text{complete reflection} \qquad (9.27) \]
We will come back to the subject of radiation pressure and the momentum of light in later
chapters, and we will be able to more carefully explain why these formulas must be the way they
are.
The momentum imparted by EM radiation is known as radiation pressure. If we remember that
force can be defined as a change of momentum (F = ∆p/∆t), and pressure is force per unit area
(P = F/A):
EM Radiation Pressure:
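\[ P_{radiation} = \frac{I}{c} = \frac{E_{max}^2}{2\mu_0 c^2} = \frac{B_{max}^2}{2\mu_0} \qquad \text{complete absorption} \]
\[ P_{radiation} = \frac{2I}{c} = \frac{E_{max}^2}{\mu_0 c^2} = \frac{B_{max}^2}{\mu_0} \qquad \text{complete reflection} \qquad (9.28) \]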
Direct sunlight on Earth only exerts a pressure of about 5 µN/m², so these effects are very
small, but measurable. In the solar system, radiation pressure is an important effect; it tends to
push particles smaller than ∼ 0.1 µm outward from the sun.
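The size of the effect can be estimated directly from Eqs. 9.26 through 9.28. The sketch below assumes an intensity of 1.00 × 10³ W/m², the round figure for sunlight used in the problems at the end of this chapter; the 1 m² area and 1 s exposure are hypothetical.

```python
C_LIGHT = 2.99792e8            # speed of light, m/s

def radiation_pressure(intensity, reflecting=False):
    """Pressure in N/m^2: I/c for absorption, 2I/c for reflection (Eq. 9.28)."""
    factor = 2.0 if reflecting else 1.0
    return factor * intensity / C_LIGHT

def momentum_transfer(intensity, area, dt, reflecting=False):
    """Momentum in kg·m/s delivered to a surface (Eqs. 9.26 and 9.27)."""
    return radiation_pressure(intensity, reflecting) * area * dt

I_sun = 1.00e3                 # W/m^2, assumed intensity of sunlight
p_abs = radiation_pressure(I_sun)
p_ref = radiation_pressure(I_sun, reflecting=True)
print(f"absorbing surface:  {p_abs * 1e6:.2f} micro-N/m^2")
print(f"reflecting surface: {p_ref * 1e6:.2f} micro-N/m^2")
print(f"momentum on 1 m^2 in 1 s (absorbing): {momentum_transfer(I_sun, 1.0, 1.0):.2e} kg·m/s")
```

Both values are a few µN/m², the same order of magnitude as the figure quoted above.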
Electromagnetic waves cover many orders of magnitude in frequency and wavelength, but always
obey this relationship. Since the velocity of light in vacuum c is fixed, this means if you know either
f or λ, you automatically know the other. Figure 9.9 shows the frequency and wavelength ranges
for some types of EM waves. Note that our common definitions of wave types are not precise, and
overlap (e.g., X-rays and UV). Section 21.12 in Serway has a nice discussion of different sorts of
EM waves.
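Converting between frequency and wavelength is a one-line calculation once c is fixed. The sketch below assumes the usual wave relation c = fλ; the example frequencies (60 Hz wall current, a 100 MHz FM radio signal, and 2.45 GHz as a typical microwave oven frequency) are illustrative assumptions rather than values from the figure.

```python
C_LIGHT = 2.99792e8            # speed of light in vacuum, m/s

def wavelength(frequency):
    """Wavelength in meters for a wave of the given frequency in Hz."""
    return C_LIGHT / frequency

def frequency(wavelength_m):
    """Frequency in Hz for a wave of the given wavelength in meters."""
    return C_LIGHT / wavelength_m

examples = {
    "60 Hz wall current": 60.0,
    "100 MHz FM radio": 100e6,
    "2.45 GHz microwave": 2.45e9,
}
for name, f in examples.items():
    print(f"{name:20s} wavelength = {wavelength(f):.3g} m")

# Going the other way: green light at 550 nm
print(f"550 nm light: f = {frequency(550e-9):.3g} Hz")
```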
One final question: Why do microwave ovens have a screen over the door with small
holes in it? How does the screen protect us from microwave exposure, yet allow us to see
inside? Hint: look at Figure 9.9. The holes in a typical microwave screen are of order
∼ 1 mm diameter.
Figure 9.9: The electromagnetic spectrum. Note that our common definitions of wave types are not precise, and overlap (e.g.,
X-rays and UV). Note the expanded views of the visible spectrum and common communications frequencies. At the smallest
wavelengths, Ångström units are commonly used, 1 Å = 10⁻¹⁰ m. Image from L. Keiner, http://www.keiner.us/.
9.7 Problems
1. A variable-frequency ac voltage source (circles with sine waves inside) is hooked up to (a) a
resistor R and an inductor L, and (b) a resistor R and a capacitor C. The resistor is the same in
both cases. A voltmeter monitors the voltage on the inductor in circuit (a), and on the capacitor
in circuit (b).
Make a rough sketch of the relative voltage read by the meter as a function of the source
frequency in each case (V versus f). Identify the circuit in which the voltmeter preferentially
reads low frequencies (“low-pass filter”), and the one in which it preferentially reads high
frequencies (“high-pass filter”).
[Figure: circuit (a), source with R and L, voltmeter V across L; circuit (b), source with R and C, voltmeter V across C.]
2. Assume that the Sun delivers an average power (P) per unit area (A) of about I ≡ P/A =
1.00 × 10³ W/m² to Earth’s surface.
(a) Calculate the total power incident on a flat tin roof 7.17 m by 21.1 m. Assume that the
radiation is incident normal (perpendicular) to the roof.
(b) Calculate the peak electric and magnetic fields of the light.
3. A helium-neon laser delivers 1.05 × 10¹⁸ photons/sec in a beam diameter of 1.75 mm. Each
photon has a wavelength of 601 nm.
(a) Calculate the amplitudes of the electric and magnetic fields inside the beam.
(b) If the beam shines perpendicularly onto a perfectly reflecting surface, what force does it
exert?
(c) If the perfectly reflecting surface is a block of aluminum with mass m = 1 g, how long will it
take for the incident photons to accelerate it to a velocity of 1 m/s? Assume the beam does not
diverge, air resistance and gravity can be neglected.
(a) Using only these components, construct both a low and high pass filter.
(b) Sketch the frequency response for each.
5. The threshold of dark-adapted (scotopic) vision is 8.0 × 10⁻¹¹ W/m² at a central wavelength
of 550 nm. If light with this intensity and wavelength enters the eye when the pupil is open to
its maximum diameter of 8.0 mm, how many photons per second enter the eye?
1. (a) is the high-pass filter. At high frequencies, the inductor represents a large resistance
path, so high frequencies want to go to the voltmeter. At low frequencies, the inductor has very
low resistance, so low frequencies want to go back to the source and not to the voltmeter.
(b) is the low-pass filter. A capacitor has a very high resistance at low frequencies, so low
frequencies want to go to the voltmeter. At high frequencies, the capacitor has a low resistance,
so the high frequencies want to go back to the source.
Figures 9.4 and 9.5 show the frequency response for high-pass and low-pass filters.
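The qualitative argument above can be checked numerically. The sketch below assumes each circuit is a simple series circuit acting as a voltage divider, and uses the standard reactances X_L = 2πfL and X_C = 1/(2πfC). The component values are hypothetical, chosen only to make the trend visible.

```python
import math

def rl_high_pass_gain(f, R, L):
    """|V_meter / V_source| with the voltmeter across the inductor (series RL)."""
    x_l = 2 * math.pi * f * L              # inductive reactance, ohms
    return x_l / math.hypot(R, x_l)

def rc_low_pass_gain(f, R, C):
    """|V_meter / V_source| with the voltmeter across the capacitor (series RC)."""
    x_c = 1.0 / (2 * math.pi * f * C)      # capacitive reactance, ohms
    return x_c / math.hypot(R, x_c)

# Hypothetical component values: 1 kOhm, 0.1 H, 0.1 uF
R, L, C = 1.0e3, 0.1, 1.0e-7
for f in (10.0, 100.0, 1.0e3, 1.0e4, 1.0e5):
    print(f, round(rl_high_pass_gain(f, R, L), 3), round(rc_low_pass_gain(f, R, C), 3))
```

The inductor column rises toward 1 as f increases (high-pass), while the capacitor column falls toward 0 (low-pass).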
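2. (a) Since I is just power per unit area, the first part is easy: multiply the intensity by the
area of the roof,
$$P = I \cdot A = (1.00\times10^{3}\ \mathrm{W/m^2})(7.17\ \mathrm{m})(21.1\ \mathrm{m}) \approx 1.51\times10^{5}\ \mathrm{W}$$
(b) The intensity is related to the peak electric field by
$$I = \frac{E_{\max}^2}{2\mu_0 c}$$
Use the above to get Emax, then use the fact that c = Emax/Bmax. You should get Emax =
868 V/m, Bmax = 2.89 µT.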
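A minimal sketch of this calculation in Python, using only the numbers given in the problem and I = E²max/(2µ0c). The function and variable names are illustrative.

```python
import math

MU0 = 4 * math.pi * 1e-7   # permeability of free space, N/A^2
C = 3.00e8                 # speed of light, m/s

def peak_fields(intensity):
    """Return (E_max in V/m, B_max in T) for a plane wave of the given intensity."""
    e_max = math.sqrt(2 * MU0 * C * intensity)
    return e_max, e_max / C

I = 1.00e3                       # W/m^2, from the problem statement
power = I * 7.17 * 21.1          # W, intensity times roof area
e_max, b_max = peak_fields(I)
print(power, e_max, b_max)
```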
3. (a) This is a multi-step problem, and requires a bit of thought - there is no one formula to
start with. First, we know we can relate the amplitudes of the E and B fields to the intensity of
light I. Second, we know that the ratio of the E and B fields has to give the speed of light:
$$I = \frac{E_{\max}^2}{2\mu_0 c} \qquad \text{and} \qquad c = \frac{|\vec{E}|}{|\vec{B}|}$$
Now, how does this help us? We first have to think about what intensity I is - energy per unit
time per unit area. Since light energy of a single wavelength - like we have here - comes as
individual photons, the energy delivered per second has to be the number of photons per second
times the energy per photon:
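$$I = \left[\frac{\text{photons}}{\text{time}}\right]\left[\frac{\text{energy}}{\text{photon}}\right]\left[\frac{1}{\text{Area}}\right]$$
$$E_{\text{photon}} = hf = \frac{hc}{\lambda}$$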
Here we used the relation between wavelength λ, frequency f , and the speed of light c: λf = c.
We are given the number of photons per second, let’s call this N. Putting together our formulas:
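$$I = \left[\frac{\text{photons}}{\text{time}}\right]\left[\frac{\text{energy}}{\text{photon}}\right]\left[\frac{1}{\text{Area}}\right] = N \cdot \frac{hc}{\lambda} \cdot \frac{1}{\text{Area}} = \frac{E_{\max}^2}{2\mu_0 c}$$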
Here A is just the area of the beam, which is easily found from the diameter given. Now we
know everything above except Emax , so we can solve for that:
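$$E_{\max}^2 = \frac{2\mu_0 h c^2}{\lambda} \cdot N \cdot \frac{1}{\pi (d/2)^2}$$
$$= \frac{2 \left(4\pi\times10^{-7}\ \mathrm{N/A^2}\right)\left(6.63\times10^{-34}\ \mathrm{J\cdot s}\right)\left(3\times10^{8}\ \mathrm{m/s}\right)^2}{601\times10^{-9}\ \mathrm{m}} \cdot \left(1.05\times10^{18}\ \mathrm{1/s}\right) \cdot \frac{1}{\pi \left(8.75\times10^{-4}\ \mathrm{m}\right)^2}$$
$$= 1.09\times10^{8}\ \frac{\mathrm{N\cdot J}}{\mathrm{A^2\cdot m\cdot s^2}} = 1.09\times10^{8}\ \frac{\mathrm{N\cdot J}}{\mathrm{C^2\cdot m}} = 1.09\times10^{8}\ \frac{\mathrm{N}}{\mathrm{C^2}}\cdot\frac{\mathrm{N\cdot m}}{\mathrm{m}} = 1.09\times10^{8}\ \frac{\mathrm{N^2}}{\mathrm{C^2}}$$
$$\Longrightarrow E_{\max} = 1.0\times10^{4}\ \mathrm{N/C} = 1.0\times10^{4}\ \mathrm{V/m}$$
Here we used the conversions 1 J = 1 N · m, 1 C = 1 A · s, and 1 N/C = 1 V/m. We can now find
Bmax easily, once again noting that V · s = T · m²: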
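$$B_{\max} = \frac{E_{\max}}{c} = \frac{1.04\times10^{4}\ \mathrm{V/m}}{3\times10^{8}\ \mathrm{m/s}} = 3.5\times10^{-5}\ \frac{\mathrm{V\cdot s}}{\mathrm{m^2}}$$
$$B_{\max} = 35\ \mu\mathrm{T}$$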
(b) So. How do we get from intensity to pressure? Photons have momentum; remember from
relativity that E = |p|c. Using this and the formula for intensity above, we figured out that for
a perfectly reflecting surface, the radiation pressure can be found via:
$$P_{\text{refl}} = \frac{2I}{c} = \frac{E_{\max}^2}{\mu_0 c^2}$$
From the known pressure P , we can easily get the force, since pressure is force per unit area:
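$$F = P \cdot A = \frac{E_{\max}^2 A}{\mu_0 c^2}$$
$$= \frac{\left(1.0\times10^{4}\ \mathrm{N/C}\right)^2}{\left(4\pi\times10^{-7}\ \mathrm{N/A^2}\right)\left(3\times10^{8}\ \mathrm{m/s}\right)^2} \cdot \pi \left(8.75\times10^{-4}\ \mathrm{m}\right)^2$$
$$= 2.3\times10^{-9}\ \frac{\mathrm{N^2\cdot A^2\cdot s^2\cdot m^2}}{\mathrm{N\cdot C^2\cdot m^2}}$$
$$F = 2.3\times10^{-9}\ \mathrm{N}$$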
(c) Given a (constant) net force, we have a (constant) net acceleration. Given a constant
acceleration, we can find the velocity at any later time. First, the acceleration:
$$a = \frac{F}{m} = \frac{2.3\times10^{-9}\ \mathrm{N}}{10^{-3}\ \mathrm{kg}} = 2.32\times10^{-6}\ \mathrm{m/s^2}$$
Remember that 1 N = 1 kg·m/s². Assuming the block to be initially at rest (v0 = 0), ignoring air
resistance, gravity, friction, etc., we now find the time it takes the block to reach a final velocity
of vf = 1 m/s.
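$$v_f = v_0 + a \cdot \Delta t$$
$$1\ \mathrm{m/s} = 0 + a \cdot \Delta t$$
$$\Longrightarrow \Delta t = \frac{v_f}{a} = \frac{1\ \mathrm{m/s}}{2.32\times10^{-6}\ \mathrm{m/s^2}}$$
$$\Delta t = 4.3\times10^{5}\ \mathrm{s} \approx 5\ \text{days}$$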
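The whole chain of reasoning in parts (a) through (c) can be reproduced in a few lines. This is only a sketch using the rounded constants from the solution above; the variable names are illustrative.

```python
import math

MU0 = 4 * math.pi * 1e-7   # permeability of free space, N/A^2
C = 3.00e8                 # speed of light, m/s
H = 6.63e-34               # Planck's constant, J*s

rate = 1.05e18             # photons per second (given)
wavelength = 601e-9        # m (given)
radius = 1.75e-3 / 2       # beam radius, m (from the given diameter)
mass = 1.0e-3              # kg (given)
v_final = 1.0              # m/s (given)

area = math.pi * radius**2
intensity = rate * H * C / wavelength / area   # W/m^2: photons/s * J/photon / area
e_max = math.sqrt(2 * MU0 * C * intensity)     # V/m
b_max = e_max / C                              # T
force = 2 * intensity / C * area               # N, perfectly reflecting surface
t = v_final * mass / force                     # s, constant acceleration from rest
print(e_max, b_max, force, t / 86400)          # last value is the time in days
```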
4. Have a look at the figures below. Remember that an inductor presents a high resistance to
high frequency signals, and lets low frequency signals through easily. Therefore, for the high-
pass filter, we use the inductor to “short out” low frequency signals to the ground, and let high
frequency signals through to the output.
Capacitors, on the other hand, present a high resistance to low frequency signals, and let high
frequency signals through easily. Therefore, for the low-pass filter, we use the capacitor to “short
out” high frequency signals to ground, and let low frequency signals through to the output. The
frequency response is as illustrated schematically.
See the course notes packet for further details, including how to select the actual component
values to determine the range of frequencies filtered.
5. 11100 photons/s
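The answer to Problem 5 follows from the same photon-counting idea used in Problem 3: the power entering the pupil divided by the energy per photon. A short sketch, with an illustrative function name:

```python
import math

H = 6.626e-34   # Planck's constant, J*s
C = 2.998e8     # speed of light, m/s

def photon_rate(intensity, wavelength, aperture_diameter):
    """Photons per second passing through a circular aperture."""
    power = intensity * math.pi * (aperture_diameter / 2) ** 2   # W
    return power * wavelength / (H * C)                          # power / (hc/lambda)

rate = photon_rate(8.0e-11, 550e-9, 8.0e-3)
print(rate)
```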
[Figure: high-pass and low-pass filter circuits (Vin, R, and L or C, Vout), each with a schematic plot of Vout versus f.]
Optics
10 Reflection and Refraction of Light
LIGHT as we know it is nothing more than a particular sort of electromagnetic wave, defined
by a rough range in frequency or wavelength. Visible light covers wavelengths of ∼ 400−700 nm,
while ultraviolet (UV) and infrared push this definition to ∼ 10 nm−10 µm.
In the next chapters when we discuss optics, we will focus on applications to visible light, though
everything we discuss will be applicable to the whole electromagnetic spectrum, from radio waves
to gamma-rays.
10.1 The Nature of Light
Photon energy:
$$E = hf = \frac{hc}{\lambda} \tag{10.1}$$
$$c = \lambda f \tag{10.2}$$
Figure 10.2: A “wave packet” in some sense combines particle and wave properties. (a) A Gaussian wave
packet, y = e^(−x²) sin x, illustrating how a modulated wave can “look like a particle.” The localized bundle
of high intensity can behave like a particle. (b-d) A series of wave packets more and more closely spaced
together, mimicking the transition from a stream of photons to a ray of light.
Almost all of the intensity is within the central region, hence the term wave packet. Light
waves of this sort mimic the behavior of single particles. If you used a light detector to measure
such a wave, you would observe discrete “ticks” corresponding to the wave packets, and nothing in
between. So this isn’t so weird and mysterious at all - light is a wave, and it can behave like particles
just because the waves aren’t simple sin’s and cos’s! The waves have regions of localized intensity,
which can be thought of as particles. As it turns out, when we discuss Quantum Mechanics, this
is also the appropriate point of view for any other particle, such as electrons.
For the rest of this chapter, we will view light as a steady stream of particles.
If we view light as a steady stream of particles, we can consider these streams of light particles
to be “rays” which travel in straight-line paths - the so-called ray approximation. Since the
speed of light is constant, light particles can have no acceleration, and therefore (by Newton’s first
law) must travel in straight-line paths. Our “light rays” are the paths of individual photons (or
wave packets) if you like, or the “wave front” connecting the points of all EM waves with the same
amplitude and phase. But more on that later.
When a light ray traveling in a (transparent) medium encounters a boundary into a second medium,
part of the ray is reflected back into the first medium. Figure 10.3a illustrates several light rays
“bouncing” off of a perfectly flat surface (or an interface between two media). The rays are all
parallel as they are incident on the surface, they all reflect at the same angle, and leave
parallel. This is known as specular reflection.
On the other hand, if the surface (or the interface between two media) is rough, the reflected
light comes out in many directions. This is known as diffuse reflection, and is shown in Fig. 10.3b.
A surface is considered “smooth” and behaves as such so long as the roughness is small compared
to the wavelength of the light. If the roughness is small compared to the wavelength, the light
cannot “see” it.
Law of Reflection
When a light ray is reflected off of a surface, the angle of incidence θi is equal to the angle
of reflection θr .
For a single ray, the angle of reflection equals the angle of incidence, as illustrated in Fig. 10.4.
Experimentally, this is true, and it also follows from the boundary conditions on the E and B fields
(Appendix B). Reflective optics is pure geometry - no matter how many media you consider in
sequence, it all boils down to geometry.
Figure 10.5: Light rays incident on an air-glass interface. The refracted ray is bent toward the interface
normal because v2 < v1. (Labels: incident angle θ1 and reflected angle θ1′ in air, speed v1; refracted angle θ2 in glass, speed v2.)
When light encounters a new medium in which its velocity is changed, it adjusts its direction in
such a way to spend more time in the medium with the higher velocity. The result of this is that
when a light ray encounters a boundary between two transparent media, part of the ray is reflected,
and part of it is bent as it enters the second medium, as shown in Fig. 10.5. The “bent” beam
is said to be refracted. The incident, reflected, and refracted beams all lie in the same geometric
plane along with the interface normal at the point of incidence.
If we first consider the case v2 < v1, as shown in Fig. 10.5, the light ray is going slower in the
second medium, so it would like to minimize the distance through that medium it has to cover. That
is accomplished by traveling through the second medium more closely to normal incidence than before
- the light ray bends toward the surface normal to get out of the second medium more quickly.
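A slightly more formal way to state this is through Fermat’s principle, or the “principle of least
time”:
Fermat’s principle
A ray of light traveling between two points follows the path that takes the least time.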
This principle is sometimes taken as the definition of a ray of light. Using calculus, one can show
that the angle θ2 “chosen” by the ray of light depends on the angle of incidence and the velocities
of light in the two media:
$$\frac{\sin\theta_2}{\sin\theta_1} = \frac{v_2}{v_1} = \text{constant} \tag{10.3}$$
The slower the velocity of light in the second medium, the more sharply the light ray bends
toward the normal - the light minimizes its time spent in the slower medium by shortening its
path.
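Equation 10.3 can be checked without calculus by letting a computer search for the fastest path. The sketch below assumes a hypothetical geometry: a source at height a above a flat interface, a target at depth b below it, and a horizontal separation d. It minimizes the travel time over the crossing point with a ternary search (the travel time is a convex function of the crossing point, so the search is safe). The numbers are arbitrary illustrations.

```python
import math

def travel_time(x, a, b, d, v1, v2):
    """Time from (0, a) above the interface to (d, -b) below it, crossing at (x, 0)."""
    return math.hypot(x, a) / v1 + math.hypot(d - x, b) / v2

def fastest_crossing(a, b, d, v1, v2, steps=200):
    """Crossing point with minimum travel time, found by ternary search on [0, d]."""
    lo, hi = 0.0, d
    for _ in range(steps):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if travel_time(m1, a, b, d, v1, v2) < travel_time(m2, a, b, d, v1, v2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

a, b, d = 1.0, 1.0, 1.0      # hypothetical geometry, arbitrary length units
v1, v2 = 1.0, 1.0 / 1.5      # hypothetical speeds, slower in the second medium
x = fastest_crossing(a, b, d, v1, v2)
sin1 = x / math.hypot(x, a)              # sine of the angle of incidence
sin2 = (d - x) / math.hypot(d - x, b)    # sine of the angle of refraction
print(sin2 / sin1, v2 / v1)              # the two ratios should agree
```

The two printed ratios agree to within the accuracy of the search, which is what Eq. 10.3 claims for the least-time path.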
On the other hand, what if v2 > v1? We could say that the light ray is going more quickly in
the second medium, and can afford to spend a bit more time after going so slowly in medium 1. An
easier way to think about it is that for refraction we can run the light rays forwards or backwards,
and we have to get the same result. That is, the path of a light ray through a refracting surface
is reversible. If light travels through the air and bends into the glass as shown, then light coming
from the glass into the air has to behave the same way.
In any case, the light rays bend closer to the normal if they enter a region where they have lower
velocity, and away from the normal in a region where they have higher velocity, as shown below.
Light wants to get out of regions of low velocity by traveling more normal, and spend more time
in regions of high velocity by making a more shallow angle.
Figure 10.6: (a) When light moves from air into glass (v2 < v1), its path is bent toward the normal, since
the velocity of light is reduced in glass compared to air, so θ1 > θ2. (b) When light moves from glass into
air (v2 > v1), its path is bent away from the normal, since it has now entered a region of higher velocity,
so θ1 < θ2.
For convenience, we often define a material constant which represents the ratio between the speed
of light in vacuum and in a given material, the index of refraction n:
Index of refraction n:
$$n \equiv \frac{c}{v} = \sqrt{\mu_r \epsilon_r} = \sqrt{\mu_r \kappa} \tag{10.4}$$
The last two relationships come from Eq. 9.22, relating the speed of light to the permeability
(µr) and permittivity (εr) or dielectric constant (κ) in a material. The index of refraction is
dimensionless (it has no units), and of course n = 1 for vacuum itself. If we use this definition, we
can rewrite Eq. 10.3:
$$\frac{\sin\theta_2}{\sin\theta_1} = \frac{v_2}{v_1} = \frac{n_1}{n_2} \tag{10.5}$$
The index of refraction for many common transparent materials is listed in Table 10.1, compiled
from several sources.
Material             n              Material               n
Vacuum               1 (exactly)    Helium                 1.000036
Air (STP)            1.0002926      Carbon Dioxide         1.00045
Water ice            1.31           Liquid water (20 °C)   1.333
Acetone              1.36           Teflon                 1.35-1.38
Glycerol             1.4729         Acrylic glass          1.490-1.492
Rock salt            1.516          Crown glass (pure)     1.50-1.54
Salt (NaCl)          1.544          Polycarbonate          1.584-1.586
Flint glass (pure)   1.60-1.62      Crown glass (impure)   1.485-1.755
Bromine              1.661          Flint glass (impure)   1.523-1.925
Cubic Zirconia       2.15-2.18      Diamond                2.419
When light travels from one medium to another, its speed changes, but its frequency does not.
Examine Fig. 10.7. When a wave passes from material 1 to material 2, the frequency at which
waves arrive at the boundary from 1 must equal the rate at which waves leave the boundary into
2. If they were not equal, since EM waves carry energy, we would have to create or destroy energy
at the boundary. This is clearly not OK. If the energy has to be conserved across the boundary,
then by Eq. 10.1, the frequency must be conserved too. So f1 = f2 ≡ f .
Figure 10.7: When a wave moves between medium 1 (n1 = c/v1, wavelength λ1) and medium 2
(n2 = c/v2, wavelength λ2), its wavelength changes, but its frequency does not.
On the other hand, we know from Eq. 9.29 that the speed of light must be related to frequency
and wavelength:
v1 = f λ1 and v2 = f λ2 (10.6)
Putting together what we know, we can relate the changes in wavelength and speed to the index
of refraction in the two media:
$$\frac{\lambda_1}{\lambda_2} = \frac{v_1}{v_2} = \frac{c/n_1}{c/n_2} = \frac{n_2}{n_1} \quad\Longrightarrow\quad \lambda_1 n_1 = \lambda_2 n_2 \tag{10.7}$$
Put another way, λ times n is a conserved quantity for light. Really, this is just a restatement of
Eq. 10.1 plus conservation of energy - E = hc/λ = hnv/λ. If we say medium 1 is vacuum, such that
n1 = 1, we can relate the index of refraction to the change in wavelength when entering a medium:
$$n = \frac{\lambda_0}{\lambda_n} \tag{10.8}$$
where λ0 is the wavelength of light in vacuum, and λn is the wavelength of light in the medium
whose refractive index is n. In the end, our most important conclusion is the following. When light
leaves one medium, of refractive index n1, and enters another, of refractive index n2, then:
$$n_1 \sin\theta_1 = n_2 \sin\theta_2$$
where the angles θ1,2 are measured with respect to a line normal to the boundary.
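As a quick illustration of the law of refraction (Eq. 10.5 rearranged) and of Eq. 10.8, the sketch below computes the refracted angle and the wavelength inside a medium. The function names are illustrative, and the indices are taken from Table 10.1; the 30° angle and 550 nm wavelength are arbitrary example inputs.

```python
import math

def refraction_angle(theta1_deg, n1, n2):
    """Angle of refraction in degrees from n1 sin(theta1) = n2 sin(theta2).

    Returns None when no refracted ray exists (total internal reflection).
    """
    s = n1 * math.sin(math.radians(theta1_deg)) / n2
    if s > 1.0:
        return None
    return math.degrees(math.asin(s))

def wavelength_in_medium(vacuum_wavelength, n):
    """Wavelength inside a medium of index n, from n = lambda_0 / lambda_n."""
    return vacuum_wavelength / n

print(refraction_angle(30.0, 1.0002926, 1.333))   # air into water: bends toward the normal
print(refraction_angle(30.0, 1.333, 1.0002926))   # water into air: bends away from the normal
print(wavelength_in_medium(550e-9, 1.333))        # green light is shorter than 550 nm in water
```

Running a ray through the function twice with the indices swapped returns the original angle, which is the reversibility property described above.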
The variation of the index of refraction with wavelength is called dispersion.
The phenomenon of dispersion is nicely illustrated by a prism, as shown in Fig. 10.9. A light ray of
a single wavelength will pass through the prism, but it will be slightly bent on entering and leaving
the prism, Fig. 10.9a. The angle of the exit ray compared to the incident ray is known as the angle
of deviation δ.
Figure 10.9: Dispersion of light by a prism. (a) A prism refracts an entering light ray both on entering and
leaving, deviating it by an angle δ. (b) Due to the wavelength variation of n, blue light is bent more than
red (δblue > δred). (c) White light entering a prism is therefore dispersed by wavelength, yielding a
spectrum. Violet light is bent the most, red light the least.
If we shine a ray of white light on the prism, something interesting happens. White light is
nothing more than a combination of all the visible colors of light in equal proportions. When the
white light passes through the prism, the blue light will be bent more than the red, and the
colors of light become spatially separated, Fig. 10.9b. The result is a display of all the colors of
the visible spectrum, Fig. 10.9c. The colors, in order of decreasing wavelength, are red, orange,
yellow, green, blue, indigo, and violet (Roy G. Biv in mnemonic form). Violet light deviates the most, red
the least, and the rest fall in between. Of course, other non-visible wavelengths of light are bent
too - ultraviolet (UV) rays would be bent still more than violet, and infrared even less than red -
we just can’t see them with the naked eye.
Before Newton’s studies of optics, most scientists believed that white was the true color of light,
and other colors were formed only by adding something to it. Newton demonstrated this was not
true by the use of prisms. His experiment was to pass white light through a prism, then direct the
individual colored beams through another prism. If light were really white, and the colors were
just added by the prism, the second prism should have added further colors to the single-colored
beams. Since the single-colored beam remained a single color, Newton concluded that the prism
actually separated the colors already present in the light. White light is the effect of combining the
visible colors of light in equal proportions.ii
10.3.3 Rainbows
Rainbows are a natural form of light dispersion, in which water droplets in the atmosphere act as
tiny prisms. A ray of white light in the atmosphere strikes a (quasi-circular) water droplet, and is
refracted and reflected as shown in Fig. 10.10.
When the light ray reaches the drop, it is first refracted at the front surface of the drop, which
causes dispersion. Violet light deviates the most, red light the least. The separated rays then
reflect off of the back surface (all wavelengths reflect at the same angle), and again reach the front
surface of the drop. The individual rays undergo refraction as they leave the drop, and overall the
red rays are bent by about 42◦ relative to the incident rays, and the violet are bent by about 40◦ .
The dispersion is small, but this is what results in a rainbow - incoming sunlight is reflected back
over a range of angles.
Incidentally, the light at the back of the raindrop does not undergo total internal reflection, as
you might think, and some light does emerge from the back. This transmitted light doesn’t create
ii
en.wikipedia.org/wiki/White
Figure 10.11: Illustration of rainbow formation, with red light leaving the drop at 42° and violet at 40°
relative to the incoming white light. Drawings by Christine LeClair.
How do we end up observing a rainbow from this small dispersion? Under the right conditions,
rain drops are present in the atmosphere. A raindrop high in the sky appears red, because red
light is deviated the most and actually reaches the observer. Other colors of light pass over the
observer’s head. For slightly lower drops, only the yellow rays are deflected at just the right angle
to reach the observer. Finally, the lowest observable drops direct violet light to the observer, and
disperse other wavelengths below the observer - the red light would just strike the ground and
not be observed. So when we observe a rainbow, we are really seeing the ≈ 2◦ angular dispersion
created by tiny water drops in the sky.
As it turns out, the rainbow formation process is independent of how big the drops are, but does
depend on the refractive index of the drops. For instance, seawater has a higher refractive index
than rain water, so the radius of a rainbow in sea spray is smaller. If you observe a rainbow on a
rainy day near sea spray, this effect is visible as a misalignment of the ‘rain’ and ‘sea’ bows. A good
example of this, and many other interesting atmospheric optical phenomena, can be found here:
http://www.atoptics.co.uk/.
Of course, rainbows don’t actually exist at some point in the sky; they are an optical phenomenon
whose apparent position depends on the observer’s location and the sun’s position in the sky. All
raindrops reflect and refract light in the same way, but only under the right circumstances do
the dispersed rays reach the observer’s eye. The position of a rainbow in the sky is always in the
opposite direction of the Sun for the observer - that is, you need the sun at your back. The bow will
be centered on the shadow of your head, and appears at an angle of 40-42◦ above the line between
your head and its shadow. As a result, if the sun is higher than 42◦ in the sky, the rainbow would
be formed below the horizon, and would not be visible. This is not strictly true if you are high
above the ground, however.
Figure 10.12 shows a “double” rainbow - a primary rainbow, along with a weaker secondary with
its colors reversed. Secondary rainbows are caused by sunlight reflecting twice inside the raindrops,
instead of just once, and the dispersion is roughly twice as large, appearing at ≈ 50 − 53◦ . As
a result of the second reflection, the colors of a secondary rainbow are reversed compared to the
primary. The dark area of unlit sky between the primary and secondary bows is called Alexander’s
band, after Alexander of Aphrodisias who first described it.
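The critical incidence angle is when the refracted beam would make an angle of 90° with the
normal, so that it travels along the boundary. Setting θ2 = 90° in the law of refraction gives
n1 sin θc = n2 sin 90° = n2.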
We can easily solve for the critical angle of incidence, above which total internal reflection occurs:
$$\sin\theta_c = \frac{n_2}{n_1} \qquad \text{for } n_1 \ge n_2 \tag{10.11}$$
This is only valid when n1 > n2 , because total internal reflection only occurs when light tries to
move from a medium of higher refractive index to one of lower refractive index. If n1 < n2 , the
formula would give sin θc > 1, which is impossible. In that case, the math tells us that what we
are proposing is physically not possible.
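Equation 10.11 is easy to tabulate. The sketch below uses a few indices from Table 10.1 with n2 = 1 for the outside medium (an assumption, standing in for air or vacuum), and returns None in the case n1 < n2 where the formula has no solution.

```python
import math

def critical_angle(n1, n2):
    """Critical angle in degrees for light going from index n1 into index n2.

    Returns None when n1 < n2, since total internal reflection cannot occur.
    """
    if n1 < n2:
        return None
    return math.degrees(math.asin(n2 / n1))

for name, n in (("water", 1.333), ("acrylic glass", 1.490), ("diamond", 2.419)):
    print(name, round(critical_angle(n, 1.0), 1))

print(critical_angle(1.0, 1.5))   # None: light entering a denser medium always refracts
```

The higher the index, the smaller the critical angle, so more of the light striking the inside surface is trapped.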
Internal reflection in prisms is a very useful way to “guide” light to where you want it, as in a
periscope. Figure 10.14 shows a few uses of prisms.
Figure 10.15: Light propagating through an optical fiber. The cladding material has a lower refractive
index than the core material, so within a range of angles (acceptance cone) incident light is confined
within the fiber by total internal reflection.
1. In experimenting with a beam of white light and an acrylic prism, you found that the critical
angle for total internal reflection for red light was greater than that for blue light. What does this
imply about the difference between the index of refraction for red and blue light (nr and nb,
respectively) in the acrylic?
nr < nb
nb < nr
nr = nb
nothing, one also needs the wavelengths
2. As light travels from a vacuum (n = 1) to a medium such as glass (n > 1), which of the
following properties remains the same?
wavelength
wave speed
frequency
none of the above
14◦
28◦
16◦
42◦
5. If the thickness of the middle layer in the figure above is 2 cm (0.02 m), how long does it take
for the light to pass through the transparent medium?
7.2 × 10⁻¹¹ s
2.5 × 10⁻⁹ s
1.3 × 10⁻¹⁰ s
5.8 × 10⁻⁸ s
2 × 10³⁰
5 × 10⁻²⁹
1 × 10¹⁵
7 × 10¹⁸
7. A pulsed ruby laser emits light at 694.3 nm. For a 13.6 ps pulse containing 3.40 J of energy,
how many photons are in the pulse? 1 ps is 10⁻¹² s.
2 × 10²⁰
1 × 10¹⁹
3 × 10²¹
5 × 10¹⁷
10.6 Problems
4. As light from the Sun enters the atmosphere, it refracts due to the small difference between
the speeds of light in air and in vacuum. The optical length of the day is defined as the time
interval between the instant when the top of the Sun is just visibly observed above the horizon,
to the instant at which the top of the Sun just disappears below the horizon. The geometric
length of the day is defined as the time interval between the instant when a geometric straight
line drawn from the observer to the top of the Sun just clears the horizon, to the instant at
which this line just dips below the horizon. The day’s optical length is slightly larger than its
geometric length.
By how much does the duration of an optical day exceed that of a geometric day? Model
the Earth’s atmosphere as uniform, with index of refraction n = 1.000293, a sharply defined
upper surface, and depth 8767 m. Assume that the observer is at the Earth’s equator so that
the apparent path of the rising and setting Sun is perpendicular to the horizon. Express your
answer to the nearest hundredth of a second.
5. A cylindrical cistern, constructed below ground level, is 2.9 m in diameter and 2.0 m deep
and is filled to the brim with a liquid whose index of refraction is 1.5. A small object rests on
the bottom of the cistern at its center. How far from the edge of the cistern can a girl whose
eyes are 1.2 m from the ground stand and still see the object?
7. Use the figure at right to give a geometrical proof that the virtual image formed by a flat
mirror is the same distance behind the mirror as the object is in front of it, and of the same
height as the object.
[Figure: an object of height h a distance p in front of a flat mirror and its image of height h′ a distance q behind it, with points P, Q, P′, and R and the angles θ marked.]
1. nr < nb. The critical angle for total internal reflection is given by Snell’s law: nprism sin θC =
nair sin 90°. Since the right side of this equation is the same for both red and blue light, we know
that nprism,red sin θC,red = nprism,blue sin θC,blue, or the product nprism sin θC must be constant.
Therefore, if the critical angle is greater for red light than for blue, then the sine of its angle
must also be greater, and the index of refraction for red light must be smaller for the product
nprism sin θC to be the same for red and blue light.
3. θ1 . Apply the law of refraction twice, once at each interface. At the top interface, n1 sin θ1 =
n2 sin θ2 . At the bottom interface, n2 sin θ2 = n1 sin θ3 . Therefore, sin θ1 = sin θ3 or θ1 = θ3 .
4. 14°. Using the equations from the previous answer ... sin θ2 = (n1/n2) sin θ1. Plugging in the
numbers given, one should get 14°.
5. 1.3 × 10⁻¹⁰ s. The time taken is simply the distance traveled in the middle layer divided by
the speed of light in medium 2. Let the thickness of the middle layer be d. Geometry tells us
that the distance the light travels in medium 2 is l = d/cos θ2 ≈ 0.021 m. The speed of light in
the medium is v2 = c/n2 ≈ 1.56 × 10⁸ m/s, so the time taken is l/v2 ≈ 1.3 × 10⁻¹⁰ s.
6. 2 × 10³⁰ photons/sec.
7. 1 × 10¹⁹ photons.
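The photon count in question 7 is just the pulse energy divided by the energy per photon, E = hc/λ (Eq. 10.1). A minimal sketch, with an illustrative function name; note that the pulse duration is not needed for the count.

```python
H = 6.626e-34   # Planck's constant, J*s
C = 2.998e8     # speed of light, m/s

def photons_in_pulse(energy, wavelength):
    """Number of photons of the given wavelength carrying the given total energy."""
    return energy * wavelength / (H * C)

n_photons = photons_in_pulse(3.40, 694.3e-9)
print(n_photons)   # of order 10^19
```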
1. Note: Ultrasonic waves are NOT light. But! They are waves, so we can apply our optics
knowledge without problem. More on this in class.
Reference the modified figure below. We can use Snell’s law at the air-liver interface. Let n1 be
the refractive index for the surrounding medium, and n2 be the refractive index for the liver.
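$$n_1 \sin 50^\circ = n_2 \sin\theta_1 \quad\Rightarrow\quad \sin\theta_1 = \frac{n_1}{n_2}\sin 50^\circ$$
[Figure: ultrasonic beam entering the liver at 50.0° and refracting at angle θ1 toward the tumor; the labeled lengths are 12.0 cm, 6.00 cm, and d.]
$$\tan\theta_1 = \frac{6}{d} \quad\Rightarrow\quad d = \frac{6}{\tan\theta_1}$$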
Next, we need n1/n2. Recall the definition of the index of refraction - it is just proportional to
1/v, where v is the velocity in the medium. Therefore, since we are told v2 = 0.85 v1:
$$\frac{n_1}{n_2} = \frac{v_2}{v_1} = 0.85 \quad\Rightarrow\quad \theta_1 = \sin^{-1}\left[0.85 \sin 50^\circ\right] \approx 40.6^\circ$$
Put what we have together ...
$$d = \frac{6}{\tan\theta_1} = \frac{6}{\tan 40.6^\circ} \approx 7\ \mathrm{cm}$$
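The two steps above (Snell's law with the speed ratio, then the right triangle) can be verified with a few lines of Python. The variable names are illustrative, and the 6 cm side is taken from the relation tan θ1 = 6/d used in the solution.

```python
import math

theta_in = 50.0        # degrees, angle of incidence (given)
speed_ratio = 0.85     # v2 / v1 = n1 / n2 (given)
opposite_side = 6.00   # cm, the side opposite theta1 in tan(theta1) = 6/d

# Snell's law: sin(theta1) = (n1/n2) sin(50 degrees)
theta1 = math.degrees(math.asin(speed_ratio * math.sin(math.radians(theta_in))))
# Right-triangle geometry: d = 6 / tan(theta1)
d = opposite_side / math.tan(math.radians(theta1))
print(theta1, d)
```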
3. One ray at a time. If we calculate the angle of deviation for red light, and then for violet, we
can just subtract those two extremal angles to find the angular dispersion. First, we will need
quite a bit of plane geometry. Reference the figure below.
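[Figure: a ray through a 60°-60°-60° prism, with angles θ1 (incidence) and θ2 (refraction) at the first face, θ3 and θ4 at the second face, and the deviations a, b, and a + b marked.]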
The first deviation the light ray experiences is the angle a on entering the prism, and then the
angle b on exiting. The total deviation is then a + b. Now look at the triangle formed by the
red line inside the prism and the top part of the prism. For this triangle, the angles are 90 − θ2 ,
90 − θ3 , and 60◦ . All the angles in a triangle must sum to 180◦ :
90 − θ2 + 90 − θ3 + 60 = 180 ⇒ θ2 + θ3 = 60
Now, note that θ1 = a + θ2 , and θ4 = b + θ3 . We can now combine all our relationships and write
down the angular deviation for a single ray:
deviation = a + b = θ1 − θ2 + θ4 − θ3 = θ1 + θ4 − 60
We are given θ1 . We can find the other θ’s with Snell’s law at the two air-prism interfaces. Let
n1 = 1 be the air, and n2 be the index of refraction for either red or violet light in the prism:
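$$n_1 \sin\theta_1 = n_2 \sin\theta_2$$
$$n_2 \sin\theta_3 = n_1 \sin\theta_4$$
$$\Rightarrow\quad \theta_2 = \sin^{-1}\left[\frac{n_1}{n_2}\sin\theta_1\right]$$
$$\theta_3 = 60 - \theta_2$$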
Now a bit more algebra gives us θ4. We don’t want to plug in numbers until the very end, since
we have to do this calculation twice - once for red light and once for violet. So we will keep
everything in symbols until the bitter end.
$$\sin\theta_4 = \frac{n_2}{n_1}\sin\theta_3 = \frac{n_2}{n_1}\sin\left(60 - \theta_2\right)$$
$$\Rightarrow\quad \sin\theta_4 = \frac{n_2}{n_1}\sin\left(60 - \sin^{-1}\left[\frac{n_1}{n_2}\sin\theta_1\right]\right)$$
The deviation angle we want is just a + b = θ1 + θ4 − 60. We are given θ1 = 50◦ , so we have
a+b = θ4 −10. Plug in the values of n2 corresponding to red and violet light to find the deviation
for red and violet light, noting n1 = 1:
dev. = a + b = θ4 − 10°
θ4,red = 58.55° ⇒ dev. red = 48.55°
θ4,violet = 63.17° ⇒ dev. violet = 53.17°
Finally, the angular dispersion is the difference in deviation angles between violet and red light.
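Since the same formula has to be evaluated twice, it is a natural candidate for a small function. In the sketch below, the indices n = 1.62 for red and n = 1.66 for violet are assumptions: they are not stated in the solution above, but were chosen because they reproduce the quoted angles θ4 to within rounding. The function name is illustrative.

```python
import math

def prism_deviation(theta1_deg, n, apex_deg=60.0):
    """Total deviation a + b (degrees) for a ray passing through a prism in air.

    Returns None if the ray is totally internally reflected at the second face.
    """
    theta2 = math.degrees(math.asin(math.sin(math.radians(theta1_deg)) / n))
    theta3 = apex_deg - theta2                      # from theta2 + theta3 = apex angle
    s = n * math.sin(math.radians(theta3))          # sin(theta4) from Snell's law
    if s > 1.0:
        return None
    theta4 = math.degrees(math.asin(s))
    return theta1_deg + theta4 - apex_deg

n_red, n_violet = 1.62, 1.66    # assumed indices, not given in the text above
dev_red = prism_deviation(50.0, n_red)
dev_violet = prism_deviation(50.0, n_violet)
dispersion = dev_violet - dev_red
print(dev_red, dev_violet, dispersion)
```

With these assumed indices the angular dispersion comes out between 4° and 5°, the difference of the two deviations listed above.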
5. First thing you need to do on a problem like this: draw a little picture to figure out what’s
going on. Really, it helps. Below is my attempt.
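[Figure: a ray from the object at the center of the cistern bottom travels through the liquid (depth h, half-width d/2) at angle θ2 from the normal, refracts at the edge, and reaches the eye at angle θ1 above the horizontal (90 − θ1 from the normal); the eye is a height l above the ground.]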
Now the problem is a bit more clear. The light from the bottom of the cistern goes up through
the water, and at the water-air interface, is refracted away from the normal. Using the angle
and distance definitions in the figure, we can first use geometry to find expressions for the two
angles θ1 and θ2 :
$$\tan\theta_1 = \frac{l}{x}$$
$$\tan\theta_2 = \frac{d/2}{h} = \frac{d}{2h} = \frac{2.9}{4} \quad\Longrightarrow\quad \theta_2 = 35.9^\circ$$
Next, we can use the law of refraction to find another relation between the two angles θ1 and
θ2 . The index of refraction of the liquid in the cistern is nliquid = 1.5 and the index of refraction
of air is nair = 1:
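$$n_{\text{air}} \sin\left(90^\circ - \theta_1\right) = n_{\text{liquid}} \sin\theta_2 \quad\Longrightarrow\quad \cos\theta_1 = 1.5 \sin 35.9^\circ \approx 0.880 \quad\Longrightarrow\quad \theta_1 \approx 28.3^\circ,\ \tan\theta_1 \approx 0.538$$
Solving the first geometric relation for the distance x from the edge:
$$x = \frac{l}{\tan\theta_1} = \frac{1.2\ \mathrm{m}}{0.538} = 2.23\ \mathrm{m}$$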
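The cistern calculation combines one geometric step in the liquid, one application of the law of refraction, and one geometric step in air. A short sketch using the numbers from the problem statement (variable names are illustrative):

```python
import math

diameter, depth = 2.9, 2.0     # m, cistern dimensions (given)
n_liquid, n_air = 1.5, 1.0     # indices of refraction
eye_height = 1.2               # m, height of the eyes above the ground (given)

theta2 = math.atan((diameter / 2) / depth)      # angle from the normal inside the liquid
sin_out = n_liquid * math.sin(theta2) / n_air   # sine of the angle from the normal in air
theta1 = math.pi / 2 - math.asin(sin_out)       # angle above the horizontal in air
x = eye_height / math.tan(theta1)               # horizontal distance from the edge
print(math.degrees(theta2), math.degrees(theta1), x)
```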
6. In order for only red light to come out, we have to have the blue light totally internally
reflected within the glass, but not the red. This is indeed possible for some range of angles,
since the index of refraction for blue light is higher. Total internal reflection for blue light takes
place when:
$$\sin\theta_c = \frac{1}{n_{\text{blue}}}$$
Since this θc is now also the minimum angle of incidence for the red light, the minimum
refracted angle θr of red light is given by:
$$\sin\theta_r = n_{\text{red}} \sin\theta_c = \frac{n_{\text{red}}}{n_{\text{blue}}}$$
One does not necessarily need the substitution sin θc = 1/nblue above – in practice at that point
you already know θc , so you can just calculate sin θc directly. It does make the result much
more elegant and comprehensible though – one sees that the two angles calculated above are
not really independent, but both simply determined by the refractive indices. So yes, there are
style points in physics too.
7. There would be a lot of leeway given on this sort of problem on an actual exam - a
bulletproof, strict geometrical proof would not be necessary to get most of the credit, so long as
your logic is correct and you make a reasonable case. I’ll sketch how one may go about a proof
below, just to give you an idea.
The two angles on the left side of the mirror labeled θ are equal based on the law of reflection.
Using wave optics, we can prove that this must be so, but you were given the law of reflection,
and may take it as fact. The angle θ on the right side of the mirror must also be the same, since
it is an alternate interior angle of the lower left θ. You were given this much, and could assume
that all the θ angles are identical and start from there.
Now, the line PQP′ connecting the tips of the image and object arrows is, by construction,
perpendicular to the mirror itself, and therefore parallel to the horizontal line connecting the bases
of the arrows as well. At this point, it is already obvious that h = h′ in fact.
Given that PQP′ and the mirror are perpendicular by construction, ∠PQR and ∠P′QR
are right angles. Further, since PQP′ and the axis connecting the bases of the arrows are
parallel, the angle θ at point R and ∠PRQ must sum to a right angle: θ + ∠PRQ = 90°. The
same must be true for ∠P′RQ and θ: θ + ∠P′RQ = 90°. Therefore, θ + ∠PRQ = θ + ∠P′RQ, or
∠PRQ = ∠P′RQ.
Now consider the triangles △PQR and △P′QR. These two triangles have two equivalent angles
(∠PRQ = ∠P′RQ and ∠PQR = ∠P′QR) which bound a shared side (QR), and by the angle-side-
angle (ASA) theorem, the two triangles are congruent. Therefore, h = h′, the image height is
the same as the object height, and PQ = P′Q, the image is as far behind the mirror as the object
is in front of it.
The behavior of reflected light within the ray approximation follows from one simple principle
– the angle of incidence is equal to the angle of reflection. Everything else we need to know
about reflected light just boils down to plane geometry – so far as the physics goes, reflection is
from our point of view a solved problem! Nonetheless, we can use the law of reflection along with
some carefully applied geometry to derive the behavior of reflected light for a number of important
and often-encountered cases.
In this chapter, we will deal with the perfect reflection of light from
mirrors. Given an object and a particular sort of mirror, we will learn how
to deduce what the nature of the image formed by the mirror will be. If we
can first learn how to do this for a single point source of light, we can then
build up any more complicated object out of many point sources. Our most
important example mirrors will be a simple flat mirror, a convex spherical
mirror, and a concave spherical mirror. In passing, we will also investigate
other technologically important geometries, such as the parabolic reflectors
used in satellite dishes.
More broadly, by treating the problem of reflection in various specific
geometries, we will begin to learn about the projection, focusing, and
manipulation of light. Combined with what we will learn about refraction
in lenses in the next chapter, we will be able to understand in detail a
great number of optical instruments, such as microscopes, telescopes, and projectors.
Figure 11.1: Total internal reflection in the tail of a plastic monkey. Photo by the author.
The simplest reflecting object is just a flat mirror, as shown in Fig. 11.2. What happens if
we take a point source of light at position O, a distance p in front of the mirror? A point source
of light is just what it sounds like – a single point from which light rays leave radially in straight
lines. When the light rays exiting the source (blue) reach the surface of the mirror, we apply the
law of reflection to determine where the reflected rays go (orange). Only a few of the rays leaving
the source are drawn here.
[Figure 11.2: a point source O a distance p in front of a flat mirror, its image I a distance q behind the mirror, and an observer.]
Some rays leaving the point source are reflected off of the surface of the mirror, and reach
an observer. The rays reflected off of the mirror in this case appear to come from a point I behind
the mirror, if we extrapolate where these diverging rays appear to come from (dotted orange lines).i
Any time we have an intersection of light rays, or a point where light rays appear to originate from,
an image of the object which was the source of the rays is formed. From the observer’s point of
view, the rays leaving the source object at O appear, after reflection, to come from a point I behind the
mirror, so we would say that the viewer sees an image of the object at point I, a distance q behind
the mirror.
Remember, for reflection and refraction, we have to be able to run the rays forwards or backwards
and get the same result. If we trace the light rays from the object to the observer’s eyes, this is
of course the real path the rays take. Tracing the orange rays backward through the mirror to
find their point of convergence tells us where we would need a second point source to reproduce the
image observed. All real and virtual light rays fall into two categories – ones that converge onto a
point (either the image or the object), and ones that diverge.
Image formation:
Images are formed where light rays converge to a point (intersect), or where they appear
to originate from.
If the original point source is a distance p from the mirror, straightforward geometry tells us
that the image distance q must be the same, p = q. The image observed is exactly as far behind the
mirror as the object is in front of it. The image in this case is what is known as a virtual image
– light doesn’t actually pass through the point where the image is created, but only appears to
come from that point. A real image is formed when light actually passes through some point. Real
images can be projected onto a screen, for example, since they result from real light sources, while
virtual images cannot (hence the term “virtual”).
i. Since this is not a real light ray anyway, we do not worry about refraction in the glass making up the mirror. We
further assume the mirrors to be negligibly thin in any case.
Virtual image: Light rays don’t actually pass through an image point, but appear to
originate from there.
Real image: Light rays actually pass through a point. Only real images can be projected
onto a screen.
Our flat mirror forms a virtual image, since the image an observer sees is behind the mirror, and
does not result from real light rays coming from the point of the image. The virtual image is just
where the actual object appears to be after the mirror reflects light rays coming from it. Images
from flat mirrors are always virtual. Can we determine anything else about the image? Is the image
of the same size and shape as the object? Can we more rigorously prove our assertion that p = q?
Sure. How do we deal with more complicated objects, as opposed to simple point sources?
Figure 11.4: The location and size of a reflected image from a flat mirror can be found with a simple
geometric construction. Trace one ray from the object perpendicular to the mirror’s surface and one
ray from the object through the origin. Real light rays reflect off of the mirror, virtual light rays
continue on through the mirror. The convergence of virtual rays behind the mirror gives the image
location. Since triangles PQR and P′QR are identical, the image and object heights are equal, h = h′,
as are the image and object distances, p = |q|.
come from behind the mirror, so we continue tracing a virtual ray (dotted orange line) behind the
mirror.
Now, we need to trace at least one more ray to uniquely determine what the image looks like.
We need to find an intersection of real or virtual rays in order to have an image, so we have to have
at least two, and in general three is safer. For the second ray, we will trace a line from the tip of
the arrow to a point on the mirror at the same vertical position as the bottom of the arrow. The
use of two extremal rays gives us more confidence in the position of the resulting image – if two
such extreme rays find an intersecting point, we are fairly sure we have found the image location.
If we chose two rays at similar angles, small inaccuracies in our drawing become more important,
and we have a harder time discerning the image position and size with any accuracy. Try tracing
some ray diagrams for yourself; you will quickly find this to be true.
This second ray is reflected downward from point R on the mirror at the same angle θ at which
it impinges on the mirror. Extrapolating the reflected ray back through the mirror as a virtual ray
(dotted orange line), we see that it converges with the first virtual ray at point P′. This point of
convergence, then, must be the location of the image. Furthermore, since we are tracing out rays
from the tip of the arrow, this must be the tip of the image’s arrow. Symmetry alone tells us that
the image arrow must be upright, like the real one. If you are not convinced, trace out the same
two types of rays from the bottom of the arrow, and you will see!
We have established, then, that the image is virtual, and upright (not inverted). What about
its size? The virtual ray from R to P′, RP′, clearly must make an angle θ with the horizontal axis,
since it is just a continuation of the reflected ray at point R. The lines PQ and QP′ are horizontal,
so the angles ∠RPQ and ∠RP′Q must also be θ, since they are alternate interior angles to the θ
drawn in the figure. The triangles △RPQ and △RP′Q must therefore be equivalent, since they
share RQ as a side. If these two triangles are equivalent, it is clear that h = h′, and p = q. Now
we have proved our assertion that the image formed by an object placed in front of a flat
mirror is as far behind the mirror as the object is in front of it. We have further proved
that the image is the same size as the object. The images formed by flat mirrors faithfully
reproduce objects.
Flat Mirrors:
1. The image is as far behind the mirror as the object is in front of it.
2. The image is the same size as the object.
3. The image is upright and virtual.
\[ M \equiv \frac{\text{image height}}{\text{object height}} \equiv \frac{h'}{h} \tag{11.1} \]
where h is the object height and h′ the image height. For a flat mirror, M = 1.
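The two-ray construction for a flat mirror can also be checked numerically. The sketch below is only an illustration under stated assumptions: the mirror lies along the line x = 0, the object tip sits at (−p, h), and the function names are hypothetical. Each reflected ray is extended backward as a virtual ray, and the crossing point of the two extensions is taken as the image.

```python
def cross(a, b):
    """z-component of the 2D cross product a x b."""
    return a[0] * b[1] - a[1] * b[0]

def flat_mirror_image(p, h, y1, y2):
    """Image of the point (-p, h) in a flat mirror lying along the line x = 0.

    Two rays leave the object and strike the mirror at heights y1 and y2.
    Each reflected ray is extended backward (a virtual ray) behind the
    mirror; the function returns the point where the two extensions cross.
    """
    lines = []
    for y_hit in (y1, y2):
        incident = (p, y_hit - h)                   # direction object -> mirror
        reflected = (-incident[0], incident[1])     # law of reflection, normal along x
        backward = (-reflected[0], -reflected[1])   # virtual extension behind the mirror
        lines.append(((0.0, y_hit), backward))
    (p1, d1), (p2, d2) = lines
    t = cross((p2[0] - p1[0], p2[1] - p1[1]), d2) / cross(d1, d2)
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])

# One ray perpendicular to the mirror (y1 = h), one through the origin (y2 = 0).
print(flat_mirror_image(p=3.0, h=1.0, y1=1.0, y2=0.0))
```

The crossing point should come out at x = p and y = h whichever two rays are chosen (the two strike heights just have to differ), which is the statement that the image is as far behind the mirror as the object is in front, with M = 1.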
For future convenience, we should also lay down some conventions for our ray diagrams. First,
we will always treat the mirror as the ‘zero’ for our horizontal axis. Distance is positive in front
of the mirror, and negative behind it. Real images are formed in front of the mirror, while virtual
images are formed behind the mirror (since no light goes through the mirror). The distance from
the mirror to the object will always be p, the distance to the image always q. The height of the
object will be h, the height of the image h′.
11.1.4 Handedness
Before we move on to different mirror geometries, one last word about mirrors and handedness. You
may remember that we discussed the difference between left- and right-handed coordinate systems
in Sect. 7.1.4. You already know of course that when you look in a mirror your sense of left and
right are reversed. If you wave your right hand in the mirror, the image seems to wave its left.
Similarly, a mirror reflection is what relates left-handed and right-handed coordinate systems, or
right-handed and left-handed corkscrews. Examine Fig. 11.5, and convince yourself once again that
there is an intrinsic handedness or chirality to certain things. Only a mirror reflection can change
a left-handed to a right-handed coordinate system, no number of simple rotations will do it.
[Figure 11.5: a right-handed (RH) and a left-handed (LH) coordinate system, related by a mirror reflection.]
Figure 11.6b shows a point source O placed relatively far from a spherical mirror, outside the
center of curvature. Rays leaving point O with a sufficiently small angle intersect the mirror, and
are all reflected back through a common convergence point I. The point I is the image point, and
the convergence of rays indicates that an image will form there, as though there were a copy of the
source at that point. Since real light rays are passing through the point I, the image formed is real.
For spherical mirrors in particular, we will usually assume that the light rays from the source
make a small angle with the principal axis. When this condition is met, all incident rays will reflect
back through the image point. On the other hand, when some rays reaching the mirror make a
relatively large angle with the principal axis – when the object is relatively close to the spherical
mirror – this is no longer true, as shown in Fig. 11.7. When the object is too close to the mirror,
some of the rays making a large angle with the principal axis no longer reflect back through the
image point, and no single point of convergence exists. This means that the image formed is not
clearly focused on one point, but spread out – the image is blurry. This phenomenon is known as
spherical aberration. It is quite important for, e.g., telescopes and cameras – since spherical shapes
are the easiest to produce, most lenses have spherical shapes and will suffer from this phenomenon, as
we will see in more detail in the following chapter.
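Spherical aberration is easy to see numerically. The following Python sketch is an illustration only, with an assumed geometry: the mirror vertex V is at the origin, the center of curvature C at (−R, 0), and a point source on the axis at (−p, 0). It reflects single rays off the spherical surface with the law of reflection and reports where each reflected ray crosses the principal axis. The function name and the sample numbers are hypothetical.

```python
import math

def axis_crossing(p, R, phi):
    """Distance in front of a concave spherical mirror at which a reflected ray
    crosses the principal axis.

    The mirror vertex V is at the origin, the centre of curvature C at (-R, 0),
    and the point source O on the axis at (-p, 0). The ray strikes the mirror at
    the point seen from C at angle phi (radians, nonzero) above the axis.
    """
    mx, my = -R + R * math.cos(phi), R * math.sin(phi)   # point M on the mirror
    dx, dy = mx + p, my                                  # incident direction O -> M
    norm = math.hypot(dx, dy)
    dx, dy = dx / norm, dy / norm
    nx, ny = math.cos(phi), math.sin(phi)                # surface normal (along C -> M)
    dot = dx * nx + dy * ny
    rx, ry = dx - 2 * dot * nx, dy - 2 * dot * ny        # law of reflection
    t = -my / ry                                         # parameter at which y = 0
    return -(mx + t * rx)

R, p = 1.0, 3.0                                # illustrative values only
reference = axis_crossing(p, R, 1e-4)          # nearly on-axis (small-angle) ray
for phi in (0.01, 0.1, 0.3, 0.5):
    print(f"phi = {phi:4.2f} rad: crossing at {axis_crossing(p, R, phi):.4f} "
          f"(small-angle value {reference:.4f})")
```

Rays striking the mirror far from the axis cross it noticeably closer to the mirror than the near-axis rays do, so there is no single point of convergence.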
If we ensure that the object is sufficiently far from the mirror to avoid spherical aberration,
what will the image look like? Just like with flat mirrors, we will trace the rays coming from the
tip of an arrow placed in front of the mirror, as shown in Fig. 11.8. Again the arrow of height h is
placed a distance p from the mirror, at point O. The center of curvature for the mirror is C, and
the center of the mirror is at V .
First, we trace a ray from the tip of the arrow through the center of curvature at C. Since the
mirror is the arc of a circle, any line passing through the center of curvature must be normal to
the surface of the arc – that is, it must intersect the surface of the arc at a 90◦ angle. Therefore,
the ray drawn through the center of curvature reflects back along the same path. We will call the
angle this ray makes with the principal axis α.
Next, we draw a second ray from the tip of the arrow through the center of the mirror at V.
This ray makes an angle θ with the principal axis, and will reflect off the mirror at V with the same
angle. This ray intersects the first at the point I, and defines the tip of the image arrow. Since the
intersection point lies below the principal axis, the image is inverted. Further, we can already see
that it is not the same size as the original arrow, so the image is also magnified (here by a factor
|M| < 1, that is, reduced in size). Finally, it is real
light rays that are intersecting in front of the mirror, so the image formed is real.
Figure 11.8: The image formed by a spherical concave mirror for objects placed outside of the center
of curvature C. The image is real, magnified, and inverted.
Still, it would be nice to know exactly how big the image is, and where it is. This much we can
figure out with a bit of geometry. First, we can use the two θ angles and relate the object height
h and the image height h′. From the triangle formed by the object arrow and the uppermost ray:
\[ \tan\theta = \frac{h}{p} \tag{11.2} \]
Similarly, from the triangle formed by the reflection of that ray and the image arrow:
\[ \tan\theta = \frac{-h'}{q} \tag{11.3} \]
Note that since the image arrow points downward below the principal axis, the height of the image
is negative. Some simple algebra yields the magnification of the mirror:
\[ \tan\theta = \frac{h}{p} = -\frac{h'}{q} \tag{11.4} \]
\[ \Longrightarrow \quad M = \frac{h'}{h} = -\frac{q}{p} \tag{11.5} \]
\[ M = \frac{h'}{h} = -\frac{q}{p} \tag{11.6} \]
Here h is the height of the object, h′ is the height of the image, p is the object distance,
q is the image distance. Negative M means the image is inverted.
Assuming we know h and p to begin with, we still need one more equation in order to uniquely
determine h′ and q, the height and position of the image. For that, we can use the α angles. From
the triangle defined by the left-most α and the object,
\[ \tan\alpha = \frac{h}{p - R} \tag{11.7} \]
and, from the corresponding triangle on the image side,
\[ \tan\alpha = -\frac{h'}{R - q} \tag{11.8} \]
We can now use the above equations for tan α along with Eq. 11.6 to find another useful equation
relating p and q alone:
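\[
\begin{aligned}
\tan\alpha = \frac{h}{p - R} &= -\frac{h'}{R - q} \\
\frac{h'}{h} = -\frac{R - q}{p - R} &= -\frac{q}{p} \qquad \text{(using Eq. 11.6)} \\
p(R - q) &= q(p - R) \\
pR - pq &= qp - qR \\
pR + qR &= 2qp \\
R(p + q) &= 2qp \\
\frac{R}{2} = \frac{qp}{p + q} &= \frac{1}{\frac{1}{q} + \frac{1}{p}} \\
\frac{2}{R} &= \frac{1}{p} + \frac{1}{q}
\end{aligned}
\]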
This last expression is known as the mirror equation; it relates the image and object distances to
the physical radius of curvature of the mirror alone. As we shall find out shortly, this equation
is far more general than our simple derivation of it would imply. Coupled with the expression for
magnification, we can now deduce the behavior of any object with any concave spherical mirror . . .
so long as the object isn’t too close to the mirror.
Mirror equation:
\[ \frac{2}{R} = \frac{1}{p} + \frac{1}{q} \tag{11.9} \]
where p is the object distance, q is the image distance, and R is the radius of curvature
of the mirror.
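As a worked illustration of Eqs. 11.6 and 11.9 together, the short Python sketch below solves the mirror equation for q and then evaluates the magnification. The function name and the sample values (R = 40 cm, p = 60 cm) are assumptions made for the example, not numbers from the text.

```python
def mirror_image(p, R):
    """Solve 2/R = 1/p + 1/q for q and return (q, M), with M = -q/p."""
    if 2 * p == R:
        raise ValueError("object at the focal point: the image is at infinity")
    q = p * R / (2 * p - R)     # rearranged form of the mirror equation
    return q, -q / p

# Illustrative numbers only: R = 40 cm, object at p = 60 cm (outside C).
q, M = mirror_image(p=60.0, R=40.0)
print(f"q = {q:.1f} cm, M = {M:.2f}")
```

For these values the image forms 30 cm in front of the mirror with M = −0.5: real, inverted, and half the size of the object.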
We have already seen that forming sharp images from a concave spherical mirror requires the object
to be relatively far from the mirror (at least outside the radius of curvature). What happens if
the object is really, really far away? Say, far enough compared to R that p is essentially infinite?
When the object is very, very far away, the incident rays are all very nearly parallel to the principal
axis. For very distant sources, any small angle away from the principal axis will result in the rays
diverging too far to hit the mirror; only those rays at tiny angles relative to the principal axis will
hit the mirror. For all intents and purposes, we can assume all rays from a very distant object
impinge on the mirror parallel to the principal axis, as shown in Fig. 11.9.
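[Figure 11.9: rays from a distant object, parallel to the principal axis, reflecting through the focal point F; the focal length f and the radius of curvature R are marked.]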
The mirror equation gives us yet more insight. If we let p tend toward infinity, then 1/p tends
toward zero. In this case, q ≈ R/2 – the image is formed exactly half way between the center of
curvature and the mirror when the object is very far away compared to R. In this special case of
a distant object, all the incident rays converge at the same point F (Fig. 11.9), which we call the
focal point of the mirror. The focal length f of a mirror is just the distance between the mirror
and the focal point on the principal axis where light from a distant object would converge. Put
another way, it is the image distance q when we allow p to tend toward infinity. Thus, for our
concave spherical mirror, f = R/2.
Though the focal length and radius of curvature are simply related, it is the former that you
will hear more often in optics. The focal length of a mirror is where light would focus if we had a
point source infinitely far away, and is one way of comparing the properties of different mirrors (or
lenses, as we shall see). Even though we can’t actually realize this situation, we can get far enough
away from a mirror to approximate it, and in fact, this is the regime in which we try to operate
most optical instruments. If you have any experience with photography, you are no doubt already
familiar with focal lengths. In any case: the focal length is a characteristic of a spherical mirror,
just half its radius of curvature, and it allows us to re-write the mirror equation in an ostensibly
more useful way:
\[ \frac{1}{f} = \frac{1}{p} + \frac{1}{q} \tag{11.10} \]
where p is the object distance, q is the image distance, and f is the focal length. For a
concave spherical mirror, the focal length is half the radius of curvature, 2f = R.
The fact that spherical mirrors focus all distant light onto a single point makes them potentially
useful for, e.g., solar heating or focusing antennas. As we shall see in subsequent sections, however,
there is a still more clever geometry which is much better for light harvesting applications.
Figure 11.10: The image formed by a spherical convex mirror is virtual, magnified, and upright.
For the moment, two rays are enough to grasp the nature of image formation for a convex
mirror. First, we draw a ray horizontally from the tip of our object arrow in Fig. 11.10. This ray is
reflected upward away from the object and mirror. If we trace the reflected ray backward through
the mirror, it intersects the principal axis exactly at the focal point of the mirror. Next, we draw a
ray from the tip of the arrow through the center of curvature of the mirror. In front of the mirror,
it is a real ray, while in back of the mirror it is a virtual ray. The intersection of our two virtual
rays behind the mirror gives the image location.
In this case, we can see that the image is upright, virtual, and magnified (here reduced in size,
|M| < 1). What is the actual
image position and magnification factor? As it turns out, if we work through the geometry, the
same mirror equation is valid for convex spherical mirrors, if we keep in mind that p
and q are negative when we are behind the mirror. In this particular case for convex spherical
mirrors, h and h′ are positive, p is positive, and q is negative. Table 11.1 is a reminder of the sign
conventions we use for mirrors. Parenthetically, we note that the mirror equation also works for
flat mirrors! The radius of curvature of a flat plane is infinite, and applying this to Eq. 11.9 readily
gives q = −p: the image is as far behind the mirror as the object is in front of it.
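The convex and flat cases can be checked with the same equation. This Python sketch assumes the usual sign convention that a convex mirror, whose center of curvature lies behind the mirror, is assigned a negative R; that assumption, the function name, and the sample numbers are illustrative and not taken from Table 11.1.

```python
import math

def image_distance(p, R):
    """Image distance q from 2/R = 1/p + 1/q; R = math.inf models a flat mirror.

    Not defined for an object exactly at the focal point (p = R/2).
    """
    inv_q = 2.0 / R - 1.0 / p
    return 1.0 / inv_q

p = 30.0   # illustrative object distance
# Assumption: a convex mirror is given a negative R (centre behind the mirror).
for label, R in (("convex, R = -40", -40.0), ("flat, R = inf", math.inf)):
    q = image_distance(p, R)
    print(f"{label}: q = {q:.1f}, M = {-q / p:.2f}")
```

Both cases give a negative q (a virtual image behind the mirror) and a positive M (upright). For the flat mirror |q| = p and M = 1, while the convex mirror gives |M| < 1.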
In using these rules and analyzing different situations for spherical mirrors, we can make
some generalizations to serve as rules-of-thumb:
Figure 11.11: The type of image formed by a spherical mirror depends on the location of the object
relative to the center of curvature and the focus of the mirror. For objects outside the center of
curvature, the image is real, inverted, and reduced. For objects between the center of curvature and
focus, the image is real, inverted, and enlarged. For objects inside the focus, the images are virtual,
upright, and enlarged.
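The three regimes in Fig. 11.11 follow directly from Eq. 11.10 and M = −q/p, as the Python sketch below illustrates. The function name and the sample focal length are hypothetical; only a concave mirror (f > 0) with a real object (p > 0) is considered.

```python
def describe_image(p, f):
    """Classify the image of an object at distance p > 0 from a concave mirror
    of focal length f > 0, using 1/f = 1/p + 1/q and M = -q/p."""
    if p == f:
        return "no image (object at the focal point)"
    q = p * f / (p - f)          # rearranged form of 1/f = 1/p + 1/q
    M = -q / p
    kind = "real" if q > 0 else "virtual"
    orientation = "inverted" if M < 0 else "upright"
    if abs(M) == 1:
        size = "same size"
    else:
        size = "enlarged" if abs(M) > 1 else "reduced"
    return f"{kind}, {orientation}, {size}"

f = 10.0                         # illustrative focal length, so R = 2f = 20
for p in (30.0, 15.0, 5.0):      # outside C, between C and F, inside F
    print(p, describe_image(p, f))
```

The three sample distances should reproduce the three labels in the figure, in order.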
Figure 11.12 shows these three rules applied to concave and convex spherical mirrors. The first
rule just follows from our discussion of very distant rays incident on a spherical mirror
– the definition of the focal point is the point at which rays parallel to the principal axis reflect
through (virtual rays in the case of convex mirrors). The second rule follows in the same way.
The third rule is essentially the definition of the center of curvature – any line passing through the
center of curvature is incident normal on the surface of the mirror, and must reflect back on itself.
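[Figure 11.12: the three rays (1, 2, 3) traced for a concave and a convex spherical mirror, showing the object O, image I, center of curvature C, and focal point F.]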
‘forward’ or ‘backward,’ a point source of light placed at F will produce a parallel beam of light.
Incidentally, this works in three dimensions too. A circular paraboloid, made by rotating a parabola
about its axis, is the only 3D surface for which all rays parallel to a given ray pass through the same
point after reflection by the surface. What good is this property? Well, this is how modern car
headlights use a single bulb to produce a beam of light, and it is how satellite antennas (‘dishes’)
manage to focus an extremely tiny amount of radiation into a usable signal. Make the parabola as
large as possible, collecting radiation from as large an area as possible, and it all gets focused to a
single point, enormously amplifying the intensity. The same principle is used for radio astronomy
and solar ovens.
How does this work? Geometrically, a parabola is a conic section defined as the locus of points
equidistant from a single point (the focus) and a straight line (the directrix). This is shown in
Fig. 11.14. Without loss of generality, we will take the parabola centered on the origin of an x − y
coordinate system. Let the focus F be at the point (0, f ), and the directrix be the line y = −f .
This is still perfectly general - an arbitrary point and line, since we can make f whatever we want.
Our parabola is ‘between’ the focus and directrix.
Construct a line connecting F with an arbitrary point P (x0 , y0 ) on the parabola, and a vertical
line intersecting the directrix at point D(x0 , −f ). A parabola is, as stated above, geometrically
defined as the locus of all points for which F P = P D. If we didn’t already know that, could we
figure out what curve satisfies this relationship? We can, simply calculate the lengths F P and P D
with the distance formula:
\[
\begin{aligned}
FP &= PD \\
\sqrt{(x_0 - 0)^2 + (y_0 - f)^2} &= \sqrt{(x_0 - x_0)^2 + (y_0 + f)^2} \\
x_0^2 + y_0^2 - 2 f y_0 + f^2 &= y_0^2 + 2 f y_0 + f^2 \\
x_0^2 &= 4 f y_0 \\
y_0 &= \frac{1}{4f}\, x_0^2
\end{aligned}
\]
Lo and behold, the curve is a parabola. One can easily repeat this calculation for a parabola
centered on an arbitrary point; the same conclusion holds: a parabola is the only curve for which
all points are equidistant from a single line and a single point. For a parabola centered on (x0, y0)
symmetric about the y axis (i.e., pointing upward or downward), one finds
\[ (y - y_0) = \frac{1}{4f}\,(x - x_0)^2 . \]
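The equidistance property is simple to confirm numerically. In the Python sketch below, the focus is at (0, f) and the directrix is the line y = −f, as in the derivation; the value of f and the sample points are arbitrary illustrative choices.

```python
import math

def parabola_y(x, f):
    """Height of the parabola y = x^2 / (4f) at horizontal position x."""
    return x * x / (4.0 * f)

f = 0.5                                   # arbitrary illustrative focal distance
for x0 in (-2.0, -0.3, 0.0, 1.0, 4.0):
    y0 = parabola_y(x0, f)
    FP = math.hypot(x0 - 0.0, y0 - f)     # distance from P to the focus (0, f)
    PD = y0 + f                           # vertical distance from P to the directrix y = -f
    print(f"x0 = {x0:5.2f}: FP = {FP:.6f}, PD = {PD:.6f}")
```

The two columns should agree at every point on the curve, up to floating-point rounding.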
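[Figure 11.14: (a) a parabola with its focus F and directrix; (b) the tangent construction at point P, including the points D′ and T′.]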
So what? Now we can sketch a proof of the unique focal property of the parabola as well, using
the second portion of Fig. 11.14. If we can prove that a tangent line to the parabola at point P
will make equal angles with P F and P D, this is enough to prove the focal property. First, we must
figure out how to construct a tangent to the parabola at any point.ii
ii. Many of you probably realize how much easier this task would be with a bit of calculus - in fact, it is a trivial
problem if we use calculus. The geometric problem is not trivial, but worth working through if for no other reason
than to emphasize the fact that parabolas are simple geometric constructions, not just abstract quadratic equations. In
our studies of optics, good geometrical insight will serve you well.
By definition, triangle △FPD is isosceles - for a parabola, PF and PD are equal. Let point T
be the midpoint of the line segment FD connecting F and D. Now the triangles △FPT and △TPD have
two equal sides, since FP = PD and by construction FT = TD. The perpendicular bisector of FD, the
line PT, divides the x − y plane into two sections: all points which are nearer to F than to D, and all points
that are nearer to D than to F. Except for point P, every point on the parabola itself lies closer
to F than to D by virtue of being above the line PT.
Let B be any other point on the parabola, and B′ the point nearest to it lying on the directrix.
The line segment BB′ is the shortest possible segment connecting the point B on the parabola to
the directrix. The segment BB′ must be vertical and perpendicular to the directrix for this to be
true. By construction, then, BB′ = FB < BD - a vertical line segment from B to the directrix
must be the same length as the line segment from B to F. Since BB′ is the shortest distance from
B to the directrix, it must be shorter than BD. If this is true, then PT cannot pass through B:
every point on PT is equidistant from F and D, while B is closer to F than to D, a contradiction.
Thus P is the only point of intersection of the line PT and the parabola. Thus, PT must be tangent
to the parabola at point P.
Whew! Now, if PT is tangent to the parabola at P, the angles ∠FPT and ∠TPD must be
equal. Further, ∠TPD is equal to angle ∠D′PT′. If we imagine D′P to be a light ray incident on
a parabolic surface reflected toward F, this establishes that the incident and reflected angles are
equal. Since the point P was completely arbitrary, this means that any incident vertical ray must
be reflected through the focus F, and that any light originating at F will be reflected as a vertical
ray.
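The focal property can also be verified numerically. Unlike the geometric proof above, the Python sketch below takes the calculus shortcut mentioned in the footnote: it uses the slope x0/(2f) of the parabola y = x²/(4f) to build the surface normal, reflects a vertical ray with the law of reflection, and measures how far the reflected ray passes from the focus. The function name and sample values are hypothetical.

```python
import math

def focus_miss_distance(x0, f):
    """Reflect a vertical ray off the parabola y = x^2/(4f) at x = x0 and return
    the perpendicular distance between the reflected ray and the focus (0, f)."""
    y0 = x0 * x0 / (4.0 * f)
    m = x0 / (2.0 * f)                      # tangent slope at P (from calculus)
    norm = math.hypot(m, 1.0)
    nx, ny = -m / norm, 1.0 / norm          # unit normal to the parabola at P
    dx, dy = 0.0, -1.0                      # incident ray travelling straight down
    dot = dx * nx + dy * ny
    rx, ry = dx - 2 * dot * nx, dy - 2 * dot * ny   # law of reflection
    # r is a unit vector, so |(F - P) x r| is the distance from F to the ray.
    return abs((0.0 - x0) * ry - (f - y0) * rx)

f = 0.5                                     # arbitrary illustrative focal distance
for x0 in (-3.0, -1.0, 0.0, 0.2, 4.0):
    print(f"x0 = {x0:5.2f}: reflected ray misses the focus by {focus_miss_distance(x0, f):.2e}")
```

Every miss distance should be zero to within rounding error, whatever point on the parabola the vertical ray strikes.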
Other conic sections have reflective properties similar to the parabola. For instance, if a light
source is placed at one focus of an ellipse, the rays will converge onto the other focus after being
reflected. Any wave, including sound waves, may be substituted for light. A nice trick is to make
an elliptically-shaped room, known as a ‘whispering gallery.’ If a sound is created at one focus -
even a very quiet one - it will be heard clearly at the second focus. It is a dramatic demonstration.
You can stand at one focus and whisper so quietly someone standing next to you cannot hear, and
yet be clearly heard at the other focus. Some famous examples of rooms like this are listed in the
Wikipedia: http://en.wikipedia.org/wiki/Whispering_gallery.
1. A concave makeup mirror has a focal length of 15 cm. If an object is placed 25 cm in front
of the mirror, determine the signs of the focal length, object distance, and image distance.
+, −, +
+, −, −
+, +, −
+, +, +
2. An inverted image of an object is viewed on a screen from the side facing a converging lens.
An opaque card is then introduced covering only the upper half of the lens. What happens to
the image on the screen?
1.04 m
3.78 m
0.52 m
2.08 m
11.6 Problems
1. While looking at her image in a cosmetic mirror, Dina notes that her face is highly magnified
when she is close to the mirror, but as she backs away from the mirror, her image first becomes
blurry, then disappears when she is about 38.0 cm from the mirror, and then inverts when she
is beyond 38.0 cm.
1. +, +, +.
3. 1.04 m.
1. The fact that the image changes from upright to inverted immediately tells us that Dina
has a concave spherical mirror. A convex mirror always gives an upright image, as does a flat
mirror. The point at which the image (briefly) disappears and inverts is the focal length, so
f = 38 cm. For spherical mirrors, we know that f = R/2, so the radius of curvature is just twice as
big: R = 76 cm.
Figure 11.11 may jog your memory a bit. Right at the focal point, when the image goes from
upright and enlarged to inverted and enlarged, the image disappears.
Lenses
12.1 Quick Questions
1. An object is placed to the left of a converging lens. Which of the following statements are
true and which are false?
12.2 Problems
The authors have developed a simple and inexpensive hands-on computerized tutorial aimed
at introducing beginning students to basic circuits and electrical properties. The system is
capable of a wide variety of electrical measurements, including V (I) and I(V ) characteristics,
voltage step function response (e.g., charging and discharging capacitors), and time-dependent
behavior using a low-frequency oscilloscope. The hardware is based on an inexpensive USB-
data acquisition device. Freely-available custom software developed by the authors provides
numerous experiment modules, and is designed to be highly extensible. The project can be
implemented for < $200 per seat, and has recently been successfully utilized in an introductory
general physics course at the University of Alabama.
Electronic devices have become ubiquitous in modern society. No matter how complex these de-
vices, the electrical properties of their component materials and the basic principles behind them
remain the same. It is becoming increasingly crucial that students have a detailed, hands-on under-
standing of the basic principles of electric circuits and the electronic properties of materials. More
importantly, the proper training of the next generation of scientists and engineers compels us, as
instructors, to introduce these concepts in a manner commensurate with what they will encounter
in advanced laboratory courses and research settings. This represents a significant challenge in
terms of overall cost and flexibility to address changing needs. In this article, we present an ex-
ample system we believe meets these criteria at a minimum of cost, the details of which we make
freely available. 33
Within many disciplines, students are exposed to the basic concepts of electronic devices and
electrical properties of materials. However, providing students with a modern, hands-on approach
to basic circuits and electrical property measurements resembling what they would find in a research
laboratory is often lacking. We believe this is due in no small part to the depth and breadth of
knowledge required and the significant cost involved in equipping a teaching laboratory with modern
data-acquisition-based software and hardware. Not only is this a major gap in students’ education,
it is a serious impediment in many cases to their introduction to laboratory or industrial research.
Far too often, teaching labs seem hopelessly out of date for those of us working daily in related
fields. Prior to this project, this was often the case for the authors, but cost alone prohibited a
commercial solution to the problem. Our goal with the present system is to give students at least
a glimpse of how electrical property measurements are performed in a modern research lab, at the
minimum of cost.
A.1 Introduction and Motivation
Many physics courses do currently employ a computerized hands-on approach to electronics and
electronic properties. Unfortunately, the cost can be prohibitive. Systems comparable to the one we
present here can cost thousands of dollars per seat. Further, proprietary commercial systems are
rarely sufficiently open to extend or repair as instructional needs change – hardware is proprietary,
software is closed-source, and electronic devices change rapidly. Finally, many commercial systems
are not sufficiently transparent for students to grasp the inner workings of the underlying software
and hardware – in the end, closed systems are in danger of being ‘black boxes’ to the students, hiding
many of the fundamental aspects of electrical property measurement. The present project aims to
provide a completely open and low-cost solution that lets students perform experiments much as
they are actually performed in research laboratories, and to remove one barrier for promising
students to begin research.
In the hopes of addressing some of the issues outlined above, the authors have developed a simple
and inexpensive hands-on computerized tutorial aimed at introducing students to basic circuits and
electrical properties of materials. 33 Keeping hardware cost at a minimum, and freely distributing
software 34 will allow, we hope, rapid uptake of the system by others. The project provides a
complete hands-on system for modern data-acquisition-based electrical transport measurements, for
<$200/seat. The software to run the data acquisition hardware, laboratory procedures, complete
hardware schematics, assembly instructions – everything needed to build the hardware and install
the software – is freely available online. The hardware itself is based on the LabJack U3 data
acquisition device 35 ($90 with educational discount), augmented only by a few passive components
and a single op-amp. 36
The software allows control over sourcing and measuring current and voltage, time-dependent
behavior of RC circuits, and a simple oscilloscope. I(V ), V (I), and V (t) curves can be measured
and saved to simple ASCII files for post-analysis. The hardware is designed to be as inexpensive as
possible, transparent, and easily assembled; the software, easily installed and rapidly parsed. Cur-
rently, the prototype system is complete, and has been classroom tested (20 units for 48 students)
in Spring 2007 in an introductory physics course. 37
In the spirit of keeping the system as simple and transparent as possible, as well as working
within the hardware limitations inherent in our desire for minimal cost, we initially created a
list of working assumptions to guide the effort. First, for an introductory laboratory class, we
accept an accuracy of 5 − 10 % for teaching fundamental concepts. Second, components can be
pre-selected to avoid hardware and software limitations – components available to the students
will not fall outside the measurable range. Third, the system must be portable, simple, and easily
reproducible by colleagues without access to technical support. Fourth, minimal cost and maximal
simplicity override minor performance and accuracy gains. Finally, the system must be as far as
possible ‘student-proof’ – so long as no external hardware is interfaced, the system must not be
capable of destroying itself!
A.2 Hardware
A.2.1 LabJack U3
The heart of the system is the LabJack U3 USB-based data acquisition and control device. 35 The U3
provides 16 software-configurable “flexible I/O” (FIO) terminals, which can be configured as digital
input, digital output, or analog input, along with two timers, two counters, and four additional
digital I/O connections. When configured as analog inputs, the FIOs provide 12-bit resolution
(0−2.4 V single-ended, ±2.4 V differential). Analog input reads typically take 0.6−4.0 msec. One
dedicated 8-bit analog output (DAC0, 0−5 V) is available, with a second analog output (DAC1)
available depending on the software configuration. The U3 is USB driven and powered, requiring no
external supply connection. The primary advantages of the U3 from our point of view are extremely
low cost 38 ($90, with educational discount; volume discounts available), flexibility, and a fairly
open driver interface. The U3 does have several limitations that must be taken into consideration,
but they are relatively minor and easily worked around.
One primary limitation of the LabJack U3 is that the analog inputs are pseudo-bipolar – essentially,
one can only measure positive voltages. The 12-bit FIOs yield only ∼ 1 mV voltage resolution,
limiting accuracy on voltage and current measurements. The FIOs also have a rather low input
impedance (20 kΩ), making meter loading a potential problem. The refresh rate (20 msec) and output
frequency cutoff (3 dB at 16 Hz) restrict time-dependent measurements to an extent. Output
voltages are essentially limited to 3.6 V, and the fact that the U3 is USB-powered severely limits
overall current draw.
Given these limitations, we employed a number of ‘workarounds.’ In particular, the low refresh
rate, low input impedance, and output frequency cutoff require forethought in designing experiments,
particularly where circuit time constants play a role. The simplest and most effective workaround is
carefully choosing the components for each laboratory ahead of time, such that the students will
not immediately be aware of many limitations. When limitations are discovered, they can be used
as an important pedagogical tool for further instruction. We have found that students readily
understand and accept hardware limitations, so long as they can be explained.
Other limitations are also not so serious on further reflection. The lack of true bipolar I/O
requires manually reversing polarity and performing ‘positive’ and ‘negative’ measurements, which
in some cases can provide an instructional advantage. For example, measuring the forward I(V )
characteristic for a diode requires the student to carefully observe polarity, rather than being able
to rely on simply changing the software parameters. The input voltage limitation is circumvented
by the simple addition of a 2:1 voltage divider on voltage measuring inputs (see below), giving us
a measurement range sufficient for most experiments.
The first four FIOs are configured (in software) as differential analog inputs, which are used
for current and voltage measurements. The first analog output (DAC0) is used to drive a simple
voltage-current converter for current sourcing, and the second analog output (DAC1) provides
voltage sourcing. Figure A.1 shows the interfacing between the U3 and the I/O connections on the
student boxes.
Figure A.1: [...] output, and current input connections between the LabJack U3 and the laboratory
system. A 2:1 voltage divider increases the voltage input range of the LabJack; current
measurements are performed with a resistive shunt. Voltage is sourced directly from the DAC.
(Diagram labels: FIO3 to Iin+ and FIO2 to Iin−, with the 150 Ω shunt between them; DAC1 to Vout+; GND to Vout−.)
Input voltage is measured between the FIO1 and FIO0 terminals (Fig. A.1). This gives a rather
limited input voltage range (0 − 3.6 V), and we therefore connected the voltage input terminals
“±Vin” on the student box through a simple 2:1 voltage divider to extend the measurement range.
Naturally the U3 measures only voltages, necessitating a current-to-voltage converter.
A simple resistive shunt is sufficient for this purpose, and current measurements are performed
with a shunt (150 Ω) between the FIO3 and FIO2 analog inputs. So far as the students are concerned,
this acts like a classic ammeter – it must be in series with the load. Nominally, this gives us a
maximum measurable current of 24 mA (due to the maximum FIO input voltage), and a minimum
resolvable current change of 6 µA (due to the FIO resolution). The output current of the student
boxes is limited to ∼ 10 mA, and the output voltage to ∼ 5 V; thus, with a judicious choice of loads, the
0−24 mA input current limit does not present a serious obstacle. As mentioned above, we design
laboratory procedures with these limitations in mind, and limit the selection of loads the students may
use to work within the hardware limitations.
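The nominal ammeter limits quoted above follow directly from Ohm's law applied to the shunt. The short Python sketch below reproduces that arithmetic; it is an illustration only and not part of the distributed software, and the constant and function names are chosen here for convenience. The 1 mV resolution is the approximate figure given earlier for the 12-bit inputs.

```python
# Nominal limits of the shunt-based current measurement.
# Numerical values come from the text; all names are illustrative.

SHUNT_OHMS = 150.0        # shunt between the FIO3 and FIO2 inputs
MAX_INPUT_VOLTS = 3.6     # maximum FIO input voltage quoted in the text
RESOLUTION_VOLTS = 1e-3   # approximate voltage resolution (~1 mV)


def shunt_current(v_shunt, r_shunt=SHUNT_OHMS):
    """Current in amperes through the shunt for a measured voltage drop."""
    return v_shunt / r_shunt


max_current = shunt_current(MAX_INPUT_VOLTS)
min_step = shunt_current(RESOLUTION_VOLTS)

print(f"maximum measurable current: {max_current * 1e3:.1f} mA")
print(f"smallest resolvable step:   {min_step * 1e6:.1f} uA")
```

With these inputs the maximum comes out at 24 mA and the smallest step at roughly 6 to 7 µA, in line with the nominal values stated above.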
Naturally, due to the rather large value of the shunt resistor, its non-negligible voltage drop
must be taken into account when measuring, e.g., I(V) characteristics. We take this as an opportunity
to introduce the students to a true four-point measurement (see Fig. A.5) and working around
non-ideal meters and sources. By recognizing the hardware limitations and making them explicit,
the students quickly learn to work within them and understand proper four-point measurements.
The voltage output simply uses the built-in analog output DAC1 referenced to ground (the first
output, DAC0, is used for current output, see below). This limits the voltage range to 3.6 V, which
again is adequate with careful component choice.
Sourcing current represented the most difficult challenge within the constraints decided upon. The
U3, unfortunately, is not capable on its own of driving sufficient currents, necessitating an additional
power supply. The primary factor above all others is minimum cost, which eliminates a great many
far more elegant solutions. Portability was another prime issue in addition to cost and simplicity.
Ideally, the system should not be tethered to a wall outlet, which precludes the use of separate ac
supplies to drive active elements.
Figure A.2: Voltage-current converter circuit. The resistor R selects the ratio between the input
voltage and the output current. For portability, the op-amp is powered from two 9 V batteries.
(Diagram labels: DAC0 and GND at the op-amp input; +9 V and −9 V supply rails; resistor R; output Iout; two 9 V batteries with a DPST switch and battery test points.)
For this reason, the programmable current output is a very simple battery-powered
voltage-to-current converter, as shown in Fig. A.2. The voltage-current converter
essentially consists of one general-purpose op-amp 36 and one programming resistor. The op-amp
itself is supplied by two 9 V batteries. We added a DPST switch to open-circuit the batteries
when not in use, and battery test points on the outside of the project box. Anecdotally, we did not
replace a single battery in the 20 units over the course of a semester.
The desired current level is programmed with the DAC0 output (0 − 3.6 V when using both
analog outputs) on the U3, referenced to ground. The single programming resistor governs the
ratio between input voltage and output current. In our case, Rprog = 310 Ω yields 3.2 mA per volt of input, for
a maximum of about 10 mA output. This circuit allows only unipolar output, but as the LabJack
itself is only pseudobipolar this is not an additional limitation.
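The programming relation can be sketched in a few lines of Python. The sketch assumes the ideal op-amp result I = V/Rprog, which is consistent with the 3.2 mA per volt quoted above but is an assumption about the circuit rather than a statement from the schematic. The function names are hypothetical.

```python
# Ideal voltage-to-current programming, assuming I_out = V_DAC / R_prog.
# Values are from the text; function names are illustrative.

R_PROG_OHMS = 310.0    # programming resistor
DAC_MAX_VOLTS = 3.6    # DAC0 range when both analog outputs are in use


def programmed_current(v_dac, r_prog=R_PROG_OHMS):
    """Ideal output current in amperes for a given DAC0 voltage."""
    if not 0.0 <= v_dac <= DAC_MAX_VOLTS:
        raise ValueError("DAC voltage outside the 0-3.6 V range")
    return v_dac / r_prog


def dac_voltage_for(current, r_prog=R_PROG_OHMS):
    """DAC0 voltage needed to program a desired output current."""
    return current * r_prog


print(f"ratio: {programmed_current(1.0) * 1e3:.2f} mA per volt")
print(f"10 mA requires {dac_voltage_for(0.010):.2f} V on DAC0")
```

Under this assumption a 10 mA output needs about 3.1 V, comfortably inside the available DAC range.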
(Figure labels: Iout, Iin, Vout, Vin, V-I converter, battery switch, battery test.)
A.3 Software
The software was developed entirely by one of the authors (P.L.), using the LabWindows/CVI
development package 43 from National Instruments. All of the software for this project, excepting
the LabJack driver and its interface, has been made freely available online 33 under the GNU General
Public License. 34 Emphasis has been placed on simplicity of the user interface, and on consistency
across the modules as much as possible. For example, all measurement modules present identical
parameter input fields and graphing capabilities as far as possible, and additional help is available
through ‘tooltips’ at any time by right-clicking on elements within windows. A simple ASCII
configuration file or a GUI within the software (“Settings” menu) allows field-tuning of
hardware and software behavior by instructors (e.g., calibration factors, altering I/O settings).
Most students appeared to find the software intuitive, with few questions regarding usage. A
usability study is underway to fine-tune the user interface, as is a comparative study with more
traditional approaches to the same laboratory procedures.
Figure A.4: Left: A screenshot of the main application window. Right: A screenshot of the software, showing the ‘multimeter’
panel. Current or voltage can be sourced or measured in any combination; both source and measurement are updated in real
time. Active text areas at the bottom of the panel give the user instructions for the selected source and measurement.
The simplicity and modularity are reflected in the underlying code as well – the infrastructure
allows new or modified experimental modules to be coded and implemented in a minimum amount
of time. Thus, as new ideas develop they can be quickly realized. Extensive feedback was solicited
from students at all levels, faculty, and a software usability expert. In order to facilitate uptake
of the system, a “demo mode” is automatically entered when no hardware is present. Potential
users can download the software, and explore the functionality of the system free of cost. All
software functionality is present to ‘test-drive’ the system, with actual measurements replaced by
randomly-generated numbers.
A.3.1 Multimeter
Different tutorial modules can be selected from the “main” application window, Fig. A.4. The first
software module the students typically encounter is a ‘multimeter’ panel, Fig. A.4, chosen from the
‘dc circuits’ menu. From this panel, the student can source current or voltage, and simultaneously
measure current or voltage, in any combination. Currents from 0 − 10 mA can be sourced, and
measured from 0−20 mA, while voltages can be sourced from 0−3.5 V, and measured from 0−7 V.
The multimeter panel is not meant to mimic the behavior of a hand-held multimeter precisely.
Rather, its goal is to familiarize students with the basic practices of sourcing and measuring currents
and voltages. When the multimeter module is selected from the “dc circuits” menu (Fig. A.4),
the source and measurement regions of the module are blank until the user selects a source and
measurement function. The two text areas at the bottom of the window give various instructions,
which change as the user interacts with the module. Initially, they prompt the user to select a
source and measurement function. Once functionality has been chosen, the uppermost text area
relates to the chosen source (e.g., reminding the user that the current source also has a switch), the
lowermost to the chosen measurement. In the case shown in Fig. A.4, the user selected to source
voltage, and measure current.
Selecting voltage sourcing activates a dial, on/off button, indicator LED, and numerical readout
(middle left). The user can either dial in the voltage or type a number in the text box, and turn
the source on or off with the labeled buttons. The ‘LED’ turns green when the source is active, red
when it is off. The voltage can be changed in real-time (all sourcing and measuring is real-time,
with ∼ 50 msec update time).
Selecting current measurement activates a needle gauge, on/off button, indicator LED, and
numerical readout (middle right). The on/off button starts and stops the readout, the status of
which is also indicated by an ‘LED.’ Both the numerical readout and needle gauge read the current
in real-time while active. Though strictly speaking only unipolar measurements are performed, the
readout is bipolar to help the student troubleshoot incorrect wiring (polarity reversal). Further,
this panel allows a quick ‘field calibration’ – e.g., by connecting the current input to the current
output and electing to source and measure current. Similar behavior occurs when the user, e.g.,
sources voltage and measures current - the contents of the window reflect the chosen source and
measurement.
Usually this panel is used for the very first dc circuits experiment, which simply has the students
attempt to source and measure current and voltage for three types of components: resistors, diodes,
and capacitors. This short activity gives the students an introduction to key electrical components
and basic wiring concepts as well as an overview of the software they will utilize in later lab sessions.
Subsequently, the multimeter panel can be used to measure the equivalent resistance for series and
parallel resistors, and directly verify that the current and voltage, respectively, are the same for
both resistors.
The next level of complexity for the students is to perform current vs. voltage sweeps, I(V ),
or voltage vs. current sweeps, V (I). Both sweeps are supported, and the software functionality
is essentially identical. Both types of sweeps are provided for two reasons: first, for non-linear
components (e.g., diodes) the characteristics appear different to students at first sight, and secondly,
a true four-terminal measurement is qualitatively different in each case. We limit our discussion to
the I(V ) functionality below.
The I(V) function is selected from the “dc circuits” menu on the main window (Fig. A.4). A
screenshot of this panel is shown in Fig. A.5. The “Control” region of the window (upper right)
asks the user to specify the start and end voltages, and how many steps to take in between. The
output is unipolar, and limited to 0−3.5 V as discussed above (values specified outside this
range are coerced into it). Sweeps can run “up” or “down” as desired.
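The sweep setup described above can be sketched as follows. This is a minimal Python illustration of coercing the requested end points into the 0−3.5 V range and generating evenly spaced set points; it is not the code used in the actual software. The function names, and the choice to count points rather than intervals, are assumptions made for the example.

```python
# Sketch of a unipolar voltage sweep with range coercion.
# The 0-3.5 V limits are from the text; everything else is illustrative.

V_MIN = 0.0
V_MAX = 3.5


def coerce(value, low=V_MIN, high=V_MAX):
    """Clamp a requested voltage into the allowed output range."""
    return min(max(value, low), high)


def sweep_points(start, stop, n_points):
    """Evenly spaced set points from start to stop, inclusive.

    Both end points are coerced into range first. If start > stop,
    the sweep simply runs downward.
    """
    start = coerce(start)
    stop = coerce(stop)
    if n_points < 2:
        return [start]
    span = stop - start
    return [start + span * i / (n_points - 1) for i in range(n_points)]


up = sweep_points(0.0, 5.0, 6)     # the 5.0 V request is coerced to 3.5 V
down = sweep_points(3.0, 1.0, 3)   # a downward sweep
print(up)
print(down)
```

Computing each point from the end points, rather than repeatedly adding a step, avoids accumulating rounding error over a long sweep.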
Figure A.5: Left: A screenshot of the software, showing a measurement of a 1350 Ω resistor using the current vs. voltage
module. The maximum on the voltage axis does not correspond to the maximum sweep voltage, due to the finite resistance of
the ‘ammeter.’ Right: Schematic of a four-terminal I(V) measurement. (Schematic labels: device under test; voltmeter V at Vin+ and Vin−; ammeter I at Iin+ and Iin−; source at Vout+ and Vout−.)
Once the desired values are chosen, clicking the blue “Start I(V)” button performs the measurement
and updates the graph in real time. The ‘LED’ in the upper right corner is lit while the
measurement is active. At any time, the red “Halt” button can be pressed to immediately stop
the measurement. After taking data, the two red crosshairs can be used to select a region of data,
and the “zoom” button will rescale the plot to show only that region. “Restore” will auto-scale the
plot to its original state. Data will remain on the plot (and in memory) until the “Clear” button
is pressed. The small white crosshair is a plot read-out, and when dragged to a data point, the
“Data display” in the lower right will indicate the current, voltage, and resistance (R = V/I) at that
data point. A text field in the lower left gives interactive status and instructions, and, as in any
panel, ‘tooltips’ are available by right-clicking on objects within the panel.
Measuring I(V ) characteristics quickly introduces the students to proper four-terminal mea-
surements and the effects of, e.g., finite wire and ammeter resistances. In order to perform a
four-terminal measurement (Fig. A.5) the student must connect the current measurement in series
with the load. Further, they must simultaneously measure the actual voltage drop on the device
under test, as a non-negligible voltage drop occurs across the 150 Ω shunt resistor in our ‘ammeter.’
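The size of the shunt correction is easy to estimate. The Python sketch below uses the 1350 Ω resistor of Fig. A.5 and the 150 Ω shunt, together with a hypothetical 3.0 V source setting, to compare the resistance inferred from the source voltage alone with the four-terminal value. The function names are illustrative.

```python
# Effect of the ammeter shunt on a resistance measurement.
# Resistor and shunt values are from the text; the 3.0 V source
# setting is a hypothetical example.

SHUNT_OHMS = 150.0


def device_voltage(v_source, current, r_shunt=SHUNT_OHMS):
    """Voltage actually across the device once the shunt drop is removed."""
    return v_source - current * r_shunt


r_device = 1350.0
v_source = 3.0

# The device and the shunt are in series, so they share one current.
current = v_source / (r_device + SHUNT_OHMS)
v_device = device_voltage(v_source, current)

r_naive = v_source / current          # uses the source voltage
r_four_terminal = v_device / current  # uses the measured device voltage

print(f"current:             {current * 1e3:.2f} mA")
print(f"voltage on device:   {v_device:.2f} V")
print(f"naive R:             {r_naive:.0f} ohm")
print(f"four-terminal R:     {r_four_terminal:.0f} ohm")
```

In this example only 2.7 V of the 3.0 V appears across the resistor, which is why the voltage axis in Fig. A.5 stops short of the maximum sweep voltage.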
In any of the measurement panels (excepting the multimeter panel), any data currently on screen
can be saved through the “File” menu. If multiple curves are taken without clearing the plot, all
data currently on-screen is saved to the data file, not just the most recent data. The data itself is
saved in a tab-delimited ASCII file, with a one-line text header for column labels, readily imported
by e.g., Excel or OriginLab.
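Because the saved files are plain tab-delimited ASCII with a single header line, they can also be read with a few lines of code. The Python sketch below parses such a file and fits a straight line to extract a resistance. The column labels and data values are invented for illustration, since the exact header text written by the software is not given here, and the least-squares fit is written out by hand to keep the example self-contained.

```python
import csv
import io

# Hypothetical file contents: labels and numbers are invented.
SAMPLE = (
    "Current (A)\tVoltage (V)\n"
    "0.000\t0.00\n"
    "0.001\t1.35\n"
    "0.002\t2.70\n"
)


def read_columns(handle):
    """Return (header, columns) from a tab-delimited file with one header line."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    rows = [[float(x) for x in row] for row in reader if row]
    columns = [list(col) for col in zip(*rows)]
    return header, columns


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y versus x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


header, (current, voltage) = read_columns(io.StringIO(SAMPLE))
slope, intercept = linear_fit(current, voltage)
print(header)
print(f"fitted resistance: {slope:.0f} ohm")
```

For a real data file, the `io.StringIO(SAMPLE)` argument would be replaced by an open file handle, for example `open(path, newline="")`.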
Figure A.6: A module for observing voltage step responses. Applying stepped voltages (i.e., either suddenly turning a voltage
on or off) allows real-time observation of charging and discharging behavior. Two successive measurements are shown on
screen, one measuring the response of a series RC circuit, and one measuring the stepped output alone for comparison.
In one example, students performed V(I) measurements for series and parallel resistor combinations,
and performed linear regression to find the effective resistance. Once simple resistor
circuits have been mastered, students can be introduced to non-linear elements (such as diodes).
In particular, light-emitting diodes are an excellent example. Not only is non-linear behavior observed (and
thus I(V ) and V (I) on first sight appear to be different), students can clearly see the device light
up only for a single voltage polarity. This also allows a brief introduction to semiconductor physics
and non-linear regression. For example, comparing the threshold voltage for light-emitting diodes
of various colors to the output wavelength (as measured with a diffraction grating), students were
able to estimate Planck’s constant to within ∼ 10 %. 44
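One simple way to carry out such an estimate is to assume that the energy gained by an electron crossing the threshold voltage equals the photon energy, eV = hc/λ, so that h ≈ eVλ/c. The Python sketch below applies this relation to three LED readings. The threshold voltages and wavelengths are hypothetical values invented for illustration, not the students' data, and the simple relation itself is only an approximation to real diode behavior.

```python
# Rough estimate of Planck's constant from LED threshold voltages,
# assuming e * V_threshold = h * c / wavelength.

ELEMENTARY_CHARGE = 1.602176634e-19   # C
SPEED_OF_LIGHT = 2.99792458e8         # m/s
ACCEPTED_H = 6.62607015e-34           # J s

# Hypothetical readings: (color, threshold voltage in V, wavelength in m)
READINGS = [
    ("red", 1.9, 630e-9),
    ("yellow", 2.0, 590e-9),
    ("green", 2.1, 565e-9),
]


def planck_estimate(v_threshold, wavelength):
    """Estimate h from one LED's threshold voltage and wavelength."""
    return ELEMENTARY_CHARGE * v_threshold * wavelength / SPEED_OF_LIGHT


estimates = [planck_estimate(v, lam) for _, v, lam in READINGS]
h_mean = sum(estimates) / len(estimates)
relative_error = abs(h_mean - ACCEPTED_H) / ACCEPTED_H

print(f"mean estimate:  {h_mean:.3e} J s")
print(f"relative error: {relative_error * 100:.1f} %")
```

A fit of threshold voltage against 1/λ, whose slope is hc/e, is a common alternative that is less sensitive to a constant offset in the threshold readings.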
The next level of complexity is mastering RC circuits and time-dependent phenomena. Figure A.6
shows a module which allows characterization of step function responses. In one case, the student
observes the response of a circuit to a downward voltage step (from V to 0); in the other case, the
response to an upward step is observed (from 0 to V). This lets students monitor in real-time the
charging and discharging of a parallel RC circuit, for example.
To measure the discharge curve of an RC circuit, for example, a user-selected constant voltage
is applied for a specified wait time, and subsequently (at t = 0 on the plot) the voltage is reduced
to zero. This functionality is selected by choosing “initially” from the “Apply Voltage” drop-down
menu. The actual voltage on the resistor or capacitor can be monitored during this time, and the
RC time constant directly observed. Again, data can be easily exported for subsequent analysis, a
good introduction to exponential and logarithmic behavior.
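As an example of such an analysis, an ideal discharge follows V(t) = V0 exp(−t/RC), so a straight-line fit to ln V versus t has slope −1/RC. The Python sketch below generates an ideal discharge curve and recovers the time constant this way. The component values and the 10 msec sampling interval are hypothetical, and real data would include noise and the loading effects discussed in the text.

```python
import math


def fit_time_constant(times, volts):
    """Least-squares line through (t, ln V); returns tau = -1 / slope."""
    logs = [math.log(v) for v in volts]
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    stt = sum((t - mean_t) ** 2 for t in times)
    stl = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
    return -stt / stl


# Hypothetical components: 2 kOhm and 220 uF give RC = 0.44 s.
R_OHMS = 2000.0
C_FARADS = 220e-6
V0 = 3.0

times = [0.01 * i for i in range(1, 101)]   # 10 msec steps out to 1 s
volts = [V0 * math.exp(-t / (R_OHMS * C_FARADS)) for t in times]

tau = fit_time_constant(times, volts)
print(f"fitted time constant: {tau:.3f} s")
```

With measured data, points close to zero volts are best excluded before taking the logarithm, since noise dominates there.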
The charging curve can be just as easily measured by selecting “After Delay” from the pull-down
menu. In this case, the measurement proceeds for the specified time with zero voltage, and at t = 0
the specified voltage is applied. This allows one to observe the corresponding charging curve for an
RC circuit, which is shown in Fig. A.6. Also shown is the step response itself, measured by simply
connecting voltage output to input. Given the 20 kΩ input impedance of the LabJack FIOs, and
the time resolution of ∼ 10 msec, components giving a fairly large time constant (≳ 0.1 sec) should
be selected.
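The effect of the input impedance on a candidate component choice can be checked ahead of time. The Python sketch below assumes one particular arrangement, a series RC circuit with the 20 kΩ input connected across the capacitor, in which case the charging resistance becomes the parallel combination of the series resistor and the input impedance, and the final capacitor voltage is reduced by the resulting divider. The topology, component values, and function names are assumptions made for illustration.

```python
# Meter loading on a series RC circuit, assuming the 20 kOhm input
# sits across the capacitor. Component values are hypothetical.

R_INPUT_OHMS = 20e3       # FIO input impedance from the text
MIN_TAU_SECONDS = 0.1     # suggested lower bound on the time constant


def parallel(r1, r2):
    """Equivalent resistance of two resistors in parallel."""
    return r1 * r2 / (r1 + r2)


def loaded_rc(r_series, capacitance, r_input=R_INPUT_OHMS):
    """Return (time constant, final-voltage fraction) with loading."""
    tau = parallel(r_series, r_input) * capacitance
    fraction = r_input / (r_series + r_input)
    return tau, fraction


tau, fraction = loaded_rc(1000.0, 470e-6)
print(f"ideal time constant:  {1000.0 * 470e-6:.3f} s")
print(f"loaded time constant: {tau:.3f} s")
print(f"final voltage is {fraction * 100:.1f} % of the applied step")
print("time constant is slow enough" if tau >= MIN_TAU_SECONDS
      else "time constant is too fast")
```

Keeping the series resistance well below 20 kΩ keeps both corrections at the few-percent level, within the 5−10 % accuracy accepted for these labs.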
A.3.4 Oscilloscope
The last module completed at the moment mimics the behavior of an oscilloscope and function
generator. A variety of waveforms can be applied and plotted in real time, and the response of a
circuit element plotted at the same time. The waveforms are generated in software, and limited by
the refresh rate of the U3 DAC and its output filters. A dc offset voltage is necessary to keep the
output voltage at or above ground at all times, due to the pseudobipolar nature of the LabJack U3.
This offset is automatically included without user intervention. Square, triangle, sawtooth, and
sinusoidal waveforms are currently supported and generated in software. In principle, arbitrarily
complex waveforms can be added in software as needed. Figure A.7 shows the result of applying
a sinusoidal voltage to a resistive circuit, and Fig. A.8 to a parallel RC circuit. This allows direct
observation of the phase-shifted response of a capacitor in an RC circuit, and an introduction to the
frequency-domain description of ac circuits. In more advanced classes, one can use, e.g., triangular
or square waveforms to illustrate integrating and differentiating RC circuits, or the effects of ‘stray
capacitance.’
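Software waveform generation with an automatic dc offset can be sketched as follows. This Python example is an illustration of the idea rather than the code used in the actual module: it builds each waveform on a unit scale and then shifts it by half the peak-to-peak amplitude so that the output never falls below ground. All names are illustrative.

```python
import math


def unit_waveform(kind, phase):
    """Waveform value in [-1, 1]; phase is measured in cycles."""
    x = phase % 1.0
    if kind == "sine":
        return math.sin(2.0 * math.pi * x)
    if kind == "square":
        return 1.0 if x < 0.5 else -1.0
    if kind == "sawtooth":
        return 2.0 * x - 1.0
    if kind == "triangle":
        return 4.0 * x - 1.0 if x < 0.5 else 3.0 - 4.0 * x
    raise ValueError(f"unknown waveform: {kind}")


def output_voltage(kind, t, frequency, peak_to_peak):
    """Output voltage at time t, offset so that it stays at or above 0 V."""
    offset = peak_to_peak / 2.0
    return offset + 0.5 * peak_to_peak * unit_waveform(kind, frequency * t)


# One cycle of a 1 Hz, 2 V peak-to-peak sine, sampled every 50 msec.
samples = [output_voltage("sine", 0.05 * i, 1.0, 2.0) for i in range(21)]
print(f"min {min(samples):.2f} V, max {max(samples):.2f} V")
```

A new waveform shape only requires another branch in `unit_waveform`, which mirrors the remark above that arbitrary waveforms can be added in software.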
Peak to peak amplitude can be specified, up to 4 V, and a ‘Response Gain’ can be used to
magnify the on-screen signal for easier comparison (only ‘raw’ data is saved). Triggering is done in
software after a specified number of cycles – the default is to trigger every 3 cycles, but this value is
user configurable in the ‘Settings’ menu. Output frequency is limited primarily by the output filters
on the LabJack DAC outputs (3 dB at 16 Hz); in practice, ≲ 5 Hz provides reasonable waveforms.
Once again, by careful choice of components, this rather low frequency limitation need not be a
problem. Induced voltages due to time-varying magnetic fields, and RLC resonant behavior are
rather inaccessible given the available frequency range, however.
Figure A.7: A module for learning the time-dependent behavior of circuits. Students can apply a low-frequency waveform
to a circuit (sine, square, triangle, or sawtooth), and observe the amplitude and phase behavior of the response. The driving
sinusoid and the response are measured simultaneously.
A.4 Outlook
In the inaugural semester, seven roughly hour-long laboratory procedures were developed by P.L.
for a non-calculus-based introductory physics course: a ‘components’ lab, introducing students to
voltage and current sourcing with resistors, capacitors, and diodes; a ‘resistors’ lab involving the
measurement of I(V ) characteristics for series and parallel resistors to find equivalent resistances;
a ‘sourcing’ lab comparing true four-terminal resistance measurements with voltage and current
sourcing (I(V ) and V (I) characteristics of resistors and diodes); an ‘rc circuits’ lab, measuring
the charging/discharging characteristics of RC circuits to find time constants; an ‘ac circuits’ lab,
measuring the phase-shifted response of a capacitor in real-time; and a procedure to measure
Planck’s constant using LEDs of various colors (described above 44 ).
Future modules in development include magnetic field mapping using a simple and inexpensive
Giant Magnetoresistive field sensor 45 powered and measured with the existing hardware, basic
optoelectronics using photocells and light-emitting diodes, and temperature-dependent resistivity
using liquid nitrogen and a thermocouple temperature sensor. We believe that the flexibility of
the hardware and modularity of the software will allow a wide variety of additional modules to be
developed.
Figure A.8: Left: Measurement of the voltage on a 2200 µF capacitor in parallel with a 200 Ω resistor driven with a sinusoid.
Right: Current-voltage characteristics for red, green, and yellow LEDs. Students compared the threshold voltage with measured
output wavelengths to estimate Planck’s constant.
We do not envision that the present apparatus will completely replace traditional experiments
using, e.g., function generators and oscilloscopes – these are instruments with which students must
be familiarized. In particular, higher frequency phenomena and resonance behavior are essen-
tially inaccessible with the current hardware, as are most induction phenomena. We envision the
present system as a way to augment existing courses and add flexibility, particularly in introduc-
tory electricity and magnetism courses where simple transparent experiments are often lacking. In
particular, we hope this system can be a viable alternative where laboratory budgets are severely
limited, or where funds could be better spent on other laboratory endeavors. Further, we have
made every effort to make the system as open as possible, such that even more industrious students
could reproduce the hardware and software if they desire. The authors are happy to assist anyone
interested in implementing the system outlined here.
A.5 Acknowledgments
The authors gratefully acknowledge financial support from the College of Arts and Sciences at UA
through an experimental teaching grant, and NSF MRSEC grant No. DMR 0213985 for additional
support. We thank T. Hayward for assistance with software usability, and C. LeClair for assistance
in preparing the manuscript. We also thank J.W. Harrell, S.T. Jones, T. Mewes, and R. Schad for
helpful discussions, and E. Stough, A. Patterson, J. Reichwein, R. Sun, D. Genkina, and S. O’Neal
for preliminary testing of the software and hardware.
Gauss’ laws:
The electric flux $\Phi_E$ through any closed surface is equal to the net charge inside the
surface, $Q_{\text{inside}}$, divided by $\epsilon_0$:
$$\Phi_{E,\,\text{closed surface}} = \frac{Q_{\text{inside}}}{\epsilon_0} \tag{B.1}$$
A mathematical proof of the boundary conditions on $\vec{E}$ and $\vec{B}$ implied by Gauss’ laws (and a
few other things) is beyond the scope of this course. We can still use the results, however.
B.1 Boundary conditions for electric fields
Normal components of $\vec{E}$ through a surface:
The difference in the normal (perpendicular) components of the electric field when crossing a
surface gives the surface charge density - the number of charges per unit area on the surface between
the materials. What this implies is that if we have two different materials stuck together, and apply
an electric field perpendicular to the interface between the materials, we will build up a charge on
the interface. This effect is crucial in semiconductor electronics, for example.
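In symbols, the statement above is usually written in the standard textbook form below, where $\sigma$ is the surface charge density on the interface and the superscripts (1) and (2) label the two sides. It is given here as a reminder and carries no equation number; note the factor of $\epsilon_0$, which converts the field difference into a charge density.

```latex
E_\perp^{(2)} - E_\perp^{(1)} = \frac{\sigma}{\epsilon_0}
```

For a conductor, where the field on the inside vanishes, this reduces to $E_\perp^{(2)} = \sigma/\epsilon_0$ just outside the surface.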
What about the components of the electric field parallel (tangential) to the surface? This is a
bit more complex, but still useful. First, the result:
Tangential components of $\vec{E}$ through a surface:
$$E_\parallel^{(2)} - E_\parallel^{(1)} = 0 \tag{B.4}$$
The parallel components of $\vec{E}$ are conserved (do not change) when we cross the boundary. As
it turns out, this is just another expression of the fact that single magnetic charges do not exist,
though proving that is beyond our present discussion. It is also an expression of the fact that $\vec{E}$
fields tend not to circulate in the way that $\vec{B}$ fields do.
What do the parallel components of $\vec{E}$ really mean? If the difference in parallel components of
$\vec{E}$ were not zero, what would that mean? It would mean that if we drew an arbitrary closed path on our
surface, and added up the tangential components of $\vec{E}$ at every point on the path, we would get a
non-zero answer, and have a net rotation of $\vec{E}$. A net rotation of $\vec{E}$ would mean that around a
closed loop, we would have a net electric field. If that were true, then the work done in going around
the closed path would not be zero! We know that cannot be true for electric fields. If this is true for
an arbitrary closed path, it is true all over the surface. So the difference in tangential components
on a surface must be zero for $\vec{E}$.
Also: if the circulation of $\vec{E}$ were not zero, it would mean electric field lines would close back
on themselves - and we know this is not true either. What is important, though, is that we can
determine the change in electric field components when we move from one material to another.
Take a charged conducting sphere of radius r, with a total charge Q sitting in vacuum. Since
it is conducting, all the charge resides on the surface. The surface area of a sphere is $4\pi r^2$, so
$\sigma_{\text{surface}} = Q/(4\pi r^2)$. From this we know the difference in the normal components of the electric field
across the surface: it is $\sigma_{\text{surface}}/\epsilon_0$. Since the field inside a conductor is zero, the field just outside the sphere is
$$E_\perp^{(2)} = \frac{Q}{4\pi\epsilon_0 r^2} \tag{B.6}$$
From boundary conditions alone, we have now shown that a charged sphere
looks just like a point charge when viewed from outside the sphere!
We also know that the electric field is always oriented perpendicularly to the surface of a conduc-
tor - the parallel component at any point on the surface is zero. Therefore the sum of the parallel
components is zero as well, and the field lines must be radially symmetric.
Normal components of $\vec{B}$ through a surface:
$$B_\perp^{(2)} - B_\perp^{(1)} = 0 \tag{B.7}$$
So the normal components of B ~ through a surface are conserved, i.e., the perpendicular com-
ponent of B ~ cannot change when passing from one material to another. This is a good thing, or
electromagnets (Section 7.4.2) wouldn’t work at all! We wouldn’t get any gain in magnetic field by
using a high permeability core material if this were not true.
The parallel components of B ~ are related to the current on the surface. If the difference in
parallel components of B ~ is not zero, what does that mean? It means that if we draw a circle
on the surface, and add up the tangential components of B ~ at every point on the circle, we get a
~
non-zero answer, and have a net rotation of B.
What do the parallel components of $\vec{B}$ really mean? Unlike the case of electric fields, the
difference in parallel components of $\vec{B}$ is not zero. That means that if we draw an arbitrary closed
path on our surface, and add up the tangential components at every point, we do get a non-zero
answer. In other words, $\vec{B}$ has a tendency to rotate or circulate on a surface. Since that is true,
then the work done in going around the closed path is not zero! Magnetic forces are fundamentally
nonconservative, unlike electric or gravitational forces. This is a bad thing for generators, motors,
or transformers, which all operate in some way by cycling a magnetic field back and forth!
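The idea of adding up tangential components around a closed path can be tried numerically for a familiar case. The sketch below is an illustration only: it assumes the standard field of a long straight wire, $B = \mu_0 I/(2\pi r)$, directed around the wire, and the function names, path sizes, and current value are hypothetical choices. It sums the tangential component of $\vec{B}$ times the path length around two circles, one that encircles the wire and one that does not.

```python
import math

MU_0 = 4.0e-7 * math.pi  # vacuum permeability in T*m/A (approximate value)


def wire_field(x, y, current):
    """(Bx, By) of an infinite straight wire along z through the origin."""
    r_squared = x * x + y * y
    prefactor = MU_0 * current / (2.0 * math.pi * r_squared)
    # The field points around the wire: direction (-y, x) / r.
    return -prefactor * y, prefactor * x


def circulation(center_x, center_y, radius, current, steps=10000):
    """Sum the tangential component of B times path length around a circle."""
    total = 0.0
    d_theta = 2.0 * math.pi / steps
    for i in range(steps):
        theta = (i + 0.5) * d_theta
        x = center_x + radius * math.cos(theta)
        y = center_y + radius * math.sin(theta)
        bx, by = wire_field(x, y, current)
        # Unit tangent for counterclockwise travel is (-sin, cos).
        b_tangential = -bx * math.sin(theta) + by * math.cos(theta)
        total += b_tangential * radius * d_theta
    return total


current_amps = 2.0  # illustrative current (an assumption)

# Path centered on the wire: it encircles the current.
enclosing = circulation(0.0, 0.0, 0.05, current_amps)
# Path shifted 20 cm away: it does not encircle the current.
not_enclosing = circulation(0.20, 0.0, 0.05, current_amps)

print("encircling path, in units of mu_0 * I:    ", enclosing / (MU_0 * current_amps))
print("non-encircling path, in units of mu_0 * I:", not_enclosing / (MU_0 * current_amps))
```

The first ratio comes out very close to 1 and the second very close to 0: the sum around a closed path is non-zero only when a current passes through the path.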
We already know that there is a tendency for “circulation” of $\vec{B}$ whenever we have a current
present - the magnetic field from a long, straight wire circulates around the wire, for example. As it
turns out, the tangential components of $\vec{B}$ are just related to the current flowing across the surface
per unit area:
Tangential components of $\vec{B}$ through a surface: