Anjan Gupta
1 The beginning of QM
I like to start this course with a remark made by the famous Nobel laureate physicist Albert
Michelson (of Michelson-Morley fame) to an audience in 1894. He said, “It seems
probable that most of the grand underlying principles have been firmly established and that
further advances are to be sought chiefly in the rigorous applications of these principles
to all phenomena which come under our notice… The future truths of physics are to be
looked for in the sixth place of decimals.”
This was a very inappropriate time for such a statement: Röntgen discovered X-rays in
1895, followed by Becquerel's discovery of radioactivity in 1896. In 1897, Thomson
discovered the electron, which changed the course of developments by raising immediate
questions such as how a neutral atom could form. Thomson's plum pudding model was
proved wrong by Rutherford by scattering α-particles from gold foils, leading to a nuclear
model with the electrons bound to a positive nucleus by the Coulomb force and orbiting it
like planets around the Sun. But this was problematic, as an accelerating charged particle
radiates, which raised questions about the stability of an atom. Bohr then proposed a model
that could explain the spectral lines of hydrogen. Some of these excitements can be lived
through by reading the first five chapters of the book “The Second Creation” by Crease and
Mann. One can easily google the chronology of the early developments in quantum mechanics
that began in the last decade of the 19th century.
The experiments and theories that eventually led to various aspects of quantum me-
chanics can be broadly divided into two categories, namely the “particle nature of waves”
and the “wave nature of particles”. We review some of the experiments below to convey
the unease created by wave-particle duality.
1.1.1 Black body radiation
Heated objects emit black body radiation with a characteristic spectrum determined by
the temperature. A black body is an object that absorbs all the em-radiation falling on it.
This is an idealization of practical situations, and such an ideal black body spectrum can
be explained theoretically. The spectrum is described by I(λ)dλ, which represents the em-
energy emitted in the wavelength window λ to λ + dλ, or by ρ(ν)dν, which is the energy
emitted in the frequency range ν to ν + dν.
A black body can be constructed using highly absorbing (non-transmitting) walls.
Further, one can geometrically confine all em-radiation within a cavity by collecting all
the light scattered from the walls. One possible construction is as shown. This is an
idealization; in reality, if the cavity is made of a particular material with a characteristic
emission spectrum, the radiation inside will not have the typical black body characteristics.
We are not concerned here with such non-ideal situations.
Figure 1: The black body spectrum, i.e. energy density vs. λ, at two different temperatures.
The peak wavelengths are marked by λm1 and λm2.
The peak wavelength λm shifts with temperature as
λm T = b ≈ 2.9 × 10⁻³ m-K. (1)
This is known as Wien's displacement law. For the Sun λm = 5100 Å, leading to T ≈ 5700 K.
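As a quick numerical sketch in Python, using the standard value of Wien's constant and the solar peak wavelength quoted above (the constant is an assumption of this sketch, since the text's own number is not shown):

```python
# Wien's displacement law lambda_m * T = b, applied to the solar peak
# wavelength quoted in the text; b = 2.898e-3 m K is the standard constant.
b = 2.898e-3        # Wien's constant, m K
lam_m = 5100e-10    # solar peak wavelength from the text, m
T = b / lam_m       # surface temperature estimate, K
print(round(T))     # ~5700 K
```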
In fact Wien had already argued for the general features of the black body spectrum
using thermodynamic arguments. Another known law, again from thermodynamic ar-
guments, was the Stefan-Boltzmann law of radiation, i.e. the power emitted by a black body
per unit surface area is given by
u = σT⁴ (2)
with σ = 5.67 × 10⁻⁸ W/m²-K⁴. Also, before Planck, two attempts were made to un-
derstand the black body spectrum, which were only partially successful. One was a theory
by Rayleigh-Jeans and another by Wien.
With this as background let us discuss how we understand black body radiation
now. In order to find the energy density due to em-waves inside a cavity in thermal
equilibrium at temperature T we need to find two things: 1) the number of em-modes in
the frequency range ν to ν + dν, i.e. N(ν)dν, and 2) the average energy ⟨u(ν)⟩ contained
in each mode at temperature T. The net energy per unit volume in the frequency range
ν to ν + dν will be the product of the two, i.e.
ρ(ν)dν = N(ν)⟨u(ν)⟩dν. (3)
So our first task is to find N (ν)dν. For this we consider stationary em-waves in a cube
shaped cavity. It turns out that the mode density is actually independent of the exact
shape for macroscopic objects.
Figure 2: Left: The stationary modes of a 1D cavity of length L in real space. Right:
The discrete allowed modes in k-space.
Before analyzing a three-dimensional cubic cavity, let's first discuss plane stationary
waves in one dimension, confined between x = 0 and L. The waves exhibit nodes at
the two ends, and thus the wavelengths of the permitted modes are given by λn = 2L/n,
see figure. The associated wave-numbers are kn = 2π/λn = nπ/L. Hence
the separation between the wave-numbers of neighboring modes is ∆k = π/L. We also
have νn λn = c, so νn = ckn/2π and ∆ν = c/2L. Since L is a large (macroscopic) length
scale, these modes are extremely close to each other. Thus the number of modes in a small
dk-interval is given by 2 × dk/(π/L). The factor of two here is due to the two independent
polarizations of an em-wave. In 2D one gets kx = nx π/L and ky = ny π/L with nx, ny as
positive non-zero integers. If we plot the modes of a 2D square-shaped cavity in the kx-ky
plane we get a square grid of discrete points with separation ∆kx = ∆ky = π/L. The
number of modes between k and k + dk can be found from the number of discrete
points in one quarter of a circular shell of radius k and thickness dk. This shell has
an area 2πk dk/4. Each discrete point in 2D k-space occupies an effective area (π/L)²,
giving the number of discrete points as (πk dk/2)/(π/L)².
In a 3D cubic cavity, one gets kx = nx π/L, ky = ny π/L and kz = nz π/L, giving a
3D grid of points in k-space with each point effectively occupying a volume (π/L)³.
For a given point the magnitude of the wave-vector is |k⃗| = √(kx² + ky² + kz²), which
determines the frequency and wavelength of that particular mode. The volume of k-space
between k and k + dk is one octant of a spherical shell of radius k and thickness dk, i.e.
4πk²dk/8. With the volume of each discrete k-point as (π/L)³, we get the number of
discrete k-points as (πk²dk/2)/(π/L)³ = L³k²dk/2π². Now counting the two independent
polarizations that each discrete k-point carries, we get the number of modes per unit
volume as N(k)dk = k²dk/π².
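The octant-shell counting above can be checked by brute force. A Python sketch, with assumed illustrative values L = 1 and a shell near k = 200, and without the polarization factor:

```python
import math

# Brute-force count of allowed k-points (nx, ny, nz positive integers,
# k_i = n_i*pi/L) in a thin shell [k0, k0+dk], compared with the octant-shell
# estimate L^3 k^2 dk / (2 pi^2) from the text (polarization factor omitted).
L = 1.0                      # assumed cavity size
k0, dk = 200.0, 5.0          # assumed shell radius and thickness
nmax = int((k0 + dk) * L / math.pi) + 1
count = 0
for nx in range(1, nmax + 1):
    for ny in range(1, nmax + 1):
        for nz in range(1, nmax + 1):
            k = (math.pi / L) * math.sqrt(nx**2 + ny**2 + nz**2)
            if k0 <= k < k0 + dk:
                count += 1
# evaluate the continuum estimate at the shell midpoint
estimate = L**3 * (k0 + dk / 2)**2 * dk / (2 * math.pi**2)
print(count, round(estimate))  # the two agree to within a few percent
```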
Figure 3: The discrete allowed modes in k-space for a 2D square cavity of side L. The ring-
like region of width dk and radius k encompasses a good number of such discrete points,
which can be found from (2πk dk/4)/(π/L)².
Wien's thermodynamic argument constrains the spectrum to the scaling form ρ(ν) =
ν³ϕ(ν/T), where ϕ(x) is some unknown function of x = ν/T. In fact people thinking of
theories beyond classical thermodynamics believed strongly in Boltzmann's theory and did
not want to sacrifice it. Wien further argued that the frequency of the em-radiation emitted
by molecules must be proportional to their kinetic energy, i.e. mv²/2 = aν, and so the
intensity of radiation should be proportional to the number of molecules at that kinetic
energy, i.e. exp(−aν/kB T). Thus he proposed ρ(ν) = (Aν³/c⁴) exp(−bν/kB T) with A, b
as constants. Equivalently one can write ρ(λ) = (A′/λ⁵) exp(−b′/λT). This function shows
a maximum at some λm, as seen in experiments. Maximizing it leads to λm T = b′/5.
This gave reasonable agreement with the experiments, with Wien's constant b′ found from
experiments. Wien got the Nobel Prize in 1911 for his contribution to the understanding
of black body radiation.
Planck’s theory of black body radiation: Planck proposed that em-radiation
comes in quanta, so that the energy in the E and B fields cannot take arbitrary values. It
turns out that the quantum theory of em-radiation is much more complex; it came much
later as quantum electrodynamics (QED), which quantized the em-fields. This is way
beyond the scope of this course.
Planck argued that for a given frequency-ν mode the energy can take values 0, hν, 2hν,
3hν, ...., i.e. nhν. The relative probability to have one quantum of energy is exp(−hν/kT )
and for two quanta it is exp(−2hν/kT ) and for n-quanta it is exp(−nhν/kT ). So total
energy associated with a frequency-ν mode will be
⟨u⟩ = Σ_{n=0}^∞ (nhν) exp(−nhν/kT) / Σ_{n=0}^∞ exp(−nhν/kT). (5)
The series in the numerator and denominator can be easily summed using Σ_{n=0}^∞ xⁿ =
1/(1 − x) and Σ_{n=0}^∞ nxⁿ = x/(1 − x)², where x = exp(−hν/kT). The first is a geometric
series and the second one can be obtained by differentiating the first result wrt x. Thus
the final average energy in the ν-frequency mode works out as hν/[exp(hν/kT) − 1]. This
leads to Planck's law of black body radiation:
ρ(ν)dν = (8πhν³/c³) dν/[exp(hν/kT) − 1]. (6)
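The summation step can be verified numerically. A Python sketch comparing truncated Boltzmann sums with the closed form, in units where energies are measured in hν and x = hν/kT (a convention assumed for this sketch):

```python
import math

# Average mode energy from truncated Boltzmann sums versus the closed form
# h*nu/(exp(h*nu/kT) - 1); energies in units of h*nu, with x = h*nu/kT.
def avg_energy_direct(x, nmax=2000):
    """sum_n (n e^{-n x}) / sum_n e^{-n x}, truncated at nmax terms."""
    num = sum(n * math.exp(-n * x) for n in range(nmax))
    den = sum(math.exp(-n * x) for n in range(nmax))
    return num / den

def avg_energy_planck(x):
    """Closed form 1/(e^x - 1) obtained from the geometric-series sums."""
    return 1.0 / (math.exp(x) - 1.0)

for x in (0.1, 1.0, 5.0):
    print(avg_energy_direct(x), avg_energy_planck(x))  # the pairs agree
```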
One can easily see that in the limit of large frequencies, i.e. hν ≫ kB T, one gets
Wien's law from Planck's expression, and in the opposite limit hν ≪ kB T it agrees with
the Rayleigh-Jeans law. Also one gets the precise Wien's displacement law from Planck's
expression (homework) as hc/λm kB T = 4.965. From the then-known Wien's constant
one could easily find the value of h/kB. Another known law at that time was Stefan's
law, i.e. the energy radiated from a black body surface per unit area and per unit time is
σT⁴, with an experimentally known σ value of 5.67 × 10⁻⁸ W m⁻² K⁻⁴. Stefan's law can
be easily derived from Planck's law, giving σ = 2π⁵kB⁴/(15c²h³). This helped Planck deduce
h = 6.55 × 10⁻³⁴ J-s and kB = 1.346 × 10⁻²³ J/K. On their own these did not help much,
as neither was known independently by any other method. But the gas constant R was known,
which helped Planck deduce the Avogadro number NA = R/kB as 6.175 × 10²³ per mole.
Also the Faraday constant F, i.e. the charge in one mole of singly charged ions, was known,
so Planck could find e = F/NA. This prediction was shortly confirmed by Geiger and
Rutherford by finding the charge on α-particles. The agreement between the two was
within about 1%.
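Planck's chain of deductions can be retraced in a few lines. A Python sketch using his 1900 values quoted above together with the modern values of R and F:

```python
# Retracing Planck's deductions with his 1900 values quoted in the text,
# together with the gas constant R and Faraday constant F.
R = 8.314        # gas constant, J/(mol K)
F = 96485.0      # Faraday constant, C/mol
kB = 1.346e-23   # Planck's deduced Boltzmann constant, J/K
NA = R / kB      # Avogadro number per mole (text quotes 6.175e23)
e = F / NA       # elementary charge, C
print(f"NA = {NA:.3e} per mole, e = {e:.2e} C")
```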
1.1.2 Photo-electric effect
The explanation for this behaviour was given by Einstein in 1905, for which he got the
Nobel Prize. Einstein proposed that an electron absorbs only one quantum of energy, hν,
from the incident light and comes out of the metal as a free electron with kinetic energy
T = hν − ϕ, where the work function ϕ is the energy barrier that prevents the electrons
from coming out of the metal. The metal has a continuous distribution of electrons below a
certain energy, called the Fermi energy EF, which is an energy ϕ below the vacuum energy.
Thus one gets electrons with kinetic energy ranging from zero (i.e. for electrons excited from
hν − ϕ below EF) up to hν − ϕ (i.e. for electrons excited right from EF). As a result, with
increasing retarding voltage V fewer electrons make it to the other electrode, and above
eVth = hν − ϕ no electron makes it to the second electrode. Thus we get Vth = (h/e)ν − ϕ/e.
This linear relation between Vth and ν was found to be experimentally correct, with its slope
well described by h/e. This also helps in deducing the work function of the metal. The
critical part of the argument is that the absorption of light happens only in one quantum
hν of energy, indicating the quantization of energy associated with a light wave.
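The relation Vth = (h/e)ν − ϕ/e is easy to tabulate. A Python sketch with an assumed work function ϕ = 2.3 eV (an illustrative alkali-metal-like value, not from the text):

```python
# Stopping voltage V_th = (h/e)*nu - phi/e; phi = 2.3 eV is an assumed,
# illustrative work function, not a value from the text.
h = 6.626e-34    # Planck constant, J s
e = 1.602e-19    # elementary charge, C
phi_eV = 2.3     # assumed work function, eV

def stopping_voltage(nu):
    """Stopping voltage (V) for light of frequency nu (Hz); a negative
    result means the photon energy is below the work function."""
    return (h / e) * nu - phi_eV

print(stopping_voltage(1.0e15))  # about 1.8 V for a 10^15 Hz photon
```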
Another interesting consequence of this experiment is the technique of “photo-electron
spectroscopy”, which helps us probe the number of electronic states available in a conductor
at different energies below the Fermi energy. This has been a very successful probe as it
has helped us understand the physics of electrons in many solids. In modern photo-electron
spectroscopy experiments one uses highly monochromatic light (from lasers or more
sophisticated sources) together with very sophisticated electron energy analyzers. This
helps one deduce the distribution of electronic states in solids with great energy resolution.
In fact, one can also get momentum information about the electrons with detectors capable
of fine angular resolution.
Figure 5: Photoemission spectroscopy.
1.1.3 X-rays
X-rays were discovered by Röntgen in 1895 through a process called Bremsstrahlung, i.e.
em-radiation from the sudden deceleration of electrons in a metal. This is actually an
inverse photoelectric effect, as high-energy electrons impinging on a metal electrode lead
to the emission of X-rays. The X-ray emission is also related to atomic transitions. One
interesting fact had consequences for the development of quantum mechanics. This was
about the detailed distribution of X-ray intensity as a function of wavelength λ. It was
found that the emitted X-ray intensity vanished below a certain wavelength λth. Further,
this λth depended on the kinetic energy of the impinging electrons, or rather the voltage V
through which the electrons were accelerated before striking the metal electrode. Thus
λth was found to depend on V as
λth = 12.4 Å / V(in kV). (8)
This behavior can be readily understood, as the maximum energy that an X-ray photon
can carry is limited by the electron's kinetic energy. The photon energy can always be
less, as the electron can lose energy to other non-radiative processes. Thus the maximum
photon energy is hc/λth = eV, or
λth = hc/eV = (6.6 × 10⁻³⁴ × 3 × 10⁸)/(1.6 × 10⁻¹⁹ × V(in kV) × 10³) m
    = (12.4 × 10⁻¹⁰/V(in kV)) m = 12.4 Å/V(in kV). (9)
This agrees nicely with experiments. X-rays have many applications; two noteworthy
ones are medical diagnostics and crystal structure analysis. In fact, the above
relation is useful to remember as a conversion formula between energy (in keV) and
wavelength (in Å).
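The conversion behind equation (9) can be coded directly. A Python sketch (the constants are standard values):

```python
# The energy-wavelength conversion of equation (9): lambda_th(Å) ~ 12.4 / V(kV).
h, c, e = 6.626e-34, 2.998e8, 1.602e-19   # standard constants

def cutoff_wavelength_angstrom(V_kV):
    """Short-wavelength Bremsstrahlung cutoff (Å) for an accelerating voltage in kV."""
    return h * c / (e * V_kV * 1e3) * 1e10

print(cutoff_wavelength_angstrom(30.0))  # ~0.41 Å for a 30 kV tube
```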
Figure 7: The left schematic shows the X-ray photon and electron before scattering,
the middle one shows them after scattering. The right schematic is the experimental
setup showing the incident X-ray beam and the Compton scattered beam analyzed by the
detector at scattering angle θ.
One needs to write the energy and momentum conservation for a 2D scattering problem
as shown in the figure. For an electron initially at rest, energy conservation gives:
hν + m0c² = √(p²c² + m0²c⁴) + hν′. (10)
Momentum conservation along the direction of the initial photon momentum gives
h/λ = (h/λ′) cos θ + p cos ϕ. (11)
Momentum conservation in the direction perpendicular to the initial photon momentum
gives
0 = −(h/λ′) sin θ + p sin ϕ. (12)
We thus have only three equations but four unknowns, namely p, ϕ, λ′ and θ. We can
eliminate two unknowns and find a relation between the remaining two. This choice is
dictated by the experimental setup, where one measures the scattered X-ray spectrum at
different scattering angles. Thus we eliminate the electron variables p and ϕ to find a
relation between λ′ and θ. From the last two equations we get
p² cos²ϕ + p² sin²ϕ = [h/λ − (h/λ′) cos θ]² + (h²/λ′²) sin²θ
or p² = h²/λ′² + h²/λ² − (2h²/λλ′) cos θ
or p²c² = h²ν² + h²ν′² − 2h²νν′ cos θ, after using λ = c/ν.
Combining this with the energy conservation equation (10) to eliminate p (homework)
leads to the Compton formula λ′ − λ = (h/m0c)(1 − cos θ).
Here λC = h/m0c is called the Compton wavelength of the electron, whose value is 0.024 Å.
Thus we see that the change in wavelength of the X-rays is rather small. In order to
resolve this change beyond the linewidth of the X-rays used, one has to use X-rays of very
small wavelength, or high energy. In an actual experiment the X-ray beam strikes a metal
foil containing free electrons as well as atoms. One measures the X-ray spectrum, i.e.
intensity as a function of wavelength, at different scattering angle θ values. One sees two
well separated peaks in a typical spectrum, one elastic (i.e. at λ) and the other inelastic,
at λ′ as dictated by the Compton formula. The elastic peak arises from Compton scattering
from the atoms, which are much heavier than electrons and thus have a much smaller
Compton wavelength, so there is a negligible shift in λ for the X-rays that Compton-scatter
from atoms.
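A Python sketch of the Compton formula (standard constants assumed):

```python
import math

# Compton shift lambda' - lambda = lambda_C (1 - cos theta), with
# lambda_C = h/(m0 c) ~ 0.024 Å for the electron (standard constants).
h, m0, c = 6.626e-34, 9.109e-31, 2.998e8
lambda_C = h / (m0 * c)   # ~2.43e-12 m

def compton_shift(theta_deg):
    """Wavelength shift (m) for a photon scattered at theta_deg by a free electron at rest."""
    return lambda_C * (1 - math.cos(math.radians(theta_deg)))

print(compton_shift(90.0))   # equals lambda_C at 90 degrees
print(compton_shift(180.0))  # maximum shift, 2*lambda_C, for backscattering
```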
1.2 Specific-heat of solids due to lattice
Solids get contribution to their specific heat from many internal degrees of freedom includ-
ing lattice vibrations and electrons (in metals). For insulators the specific heat is dictated
primarily by lattice contribution which we discuss here as it also played some role in the
development of quantum mechanics. There was certain understanding of the specific heat
which was derived from the classical equipartition theorem. For a solid containing N
atoms in a 3 dimensional solid there are 6 bulk degrees of freedom (dof ) that lead to bulk
translational and rotational kinetic energies arising from three linear momenta and three
angular momenta. The remaining 3N − 6 dof go in dictating these many lattice vibration
modes. An example in 3D would be taking a triangular molecule with three atoms at its
corner. It’ll exhibit 3 independent vibration modes.
Each vibration mode has two forms of energy associated with it, namely kinetic
energy associated with the mode-momentum (p) and elastic potential energy associated
with the mode-displacement (x), as is the case for a single oscillator of fixed frequency.
The energy is quadratic in both, i.e. E = p²/2m + kx²/2. Thus from the classical
equipartition theorem one gets kB T/2 from each of the x and p dof associated with each
mode, leading to a per-mode average energy at temperature T of kB T. This leads to an
average total energy ⟨U⟩ = (3N − 6)kB T and thus a molar specific heat c = ∂⟨U⟩/∂T =
3NA kB = 3R, where R is the gas constant. This temperature-independent classical
expectation is known as the Dulong-Petit law. But it does not agree with experiments, as
the low temperature specific heat of insulating solids is much smaller than 3R and it
vanishes as T → 0. Einstein proposed a quantum theory asserting that the lattice vibration
energy is actually quantized. Thus a frequency-ν vibration mode takes energy in quanta of
hν, just like photons. In fact a quantum of lattice vibration energy is analogously called
a phonon. This is indeed the case, as we shall see from the discussion of the quantum
simple harmonic oscillator. Thus the relative probability of a mode exhibiting energy nhν
would be exp(−nhν/kB T), and, just as in the black body radiation discussion, the
average energy associated with a mode at temperature T is given by
⟨u⟩ = Σ_{n=0}^∞ nhν exp(−nhν/kB T) / Σ_{n=0}^∞ exp(−nhν/kB T) = hν/[exp(hν/kB T) − 1]. (15)
Einstein assumed all the 3N − 6 lattice modes to be of the same frequency ν; thus the
total energy of all the modes will be (3N − 6)⟨u⟩ and so the molar specific heat of the
solid will be given by (ignoring 6 relative to NA)
c = 3NA ∂⟨u⟩/∂T = 3R (hν/kB T)² exp(hν/kB T)/[exp(hν/kB T) − 1]²
or c = 3R (TE/T)² exp(TE/T)/[exp(TE/T) − 1]². (16)
Here TE = hν/kB is a material-dependent fitting parameter called the Einstein temperature.
We can easily check the result in the high and low temperature limits. This showed a
much better agreement with the experiments. For T ≫ TE one gets c = 3R, consistent
with the classical Dulong-Petit law. For T ≪ TE one gets c = 3R(TE/T)² exp(−TE/T), so c
does approach zero as T → 0. However, the exp(−TE/T) dependence is not what one sees
experimentally at low temperatures; one rather finds c ∝ T³. Debye gave a more
satisfactory theory incorporating the fact that the modes do not all have the same
frequency; rather, one gets a systematic frequency-dependent density of modes from the
dispersion of lattice vibrations in real solids.
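The two limits of equation (16) can be checked numerically. A Python sketch with an assumed Einstein temperature TE = 200 K (an illustrative value, not from the text):

```python
import math

# Einstein specific heat c(T) = 3R (TE/T)^2 e^{TE/T} / (e^{TE/T} - 1)^2 and
# its two limits; TE = 200 K is an assumed, illustrative Einstein temperature.
R = 8.314

def c_einstein(T, TE=200.0):
    x = TE / T
    return 3 * R * x**2 * math.exp(x) / (math.exp(x) - 1.0)**2

print(c_einstein(2000.0))  # high T: approaches the Dulong-Petit value 3R ~ 24.9
print(c_einstein(20.0))    # low T: exponentially suppressed, far below 3R
```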
Figure 10: Franck-Hertz experiment schematic and the voltage-dependent plate current.
This experiment, done in 1914, demonstrated that the internal energies of an atom are
quantized, in the sense that atoms can lose energy in inelastic collisions only in certain
quanta of energy. A vacuum triode-tube with three electrodes, namely cathode, grid
and plate, was used as shown in the figure. In an evacuated tube the plate current rises
monotonically as a function of the voltage V between cathode and grid. This happens as
more of the electrons emitted by the heated cathode reach the plate when V is increased.
Now mercury vapor is introduced into the tube, which is achieved by heating the tube to
near 170 °C to evaporate mercury sealed inside it. The accelerating electrons collide
with mercury atoms and lose some of their energy. It was found that the plate current
showed a non-monotonic behavior with V. In fact it showed a periodic pattern with a
4.9 V period together with a rising background, as shown in the figure. This was interpreted
in terms of an excitation energy of the mercury atom of 4.9 eV: electrons lose energy
in collisions with mercury atoms only in multiples of 4.9 eV. A given electron can lose
energy to multiple Hg atoms within the limit of its total kinetic energy.
For matter objects with large mass, λ is very small, so these objects follow the laws of
classical physics. Even for electrons λ is quite small, so one has to look for proper
experiments to verify their wave nature.
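A Python sketch comparing λ = h/p for an electron accelerated through 100 V with that of a hypothetical 1 µg dust grain moving at 1 mm/s (both sets of numbers are assumed for illustration):

```python
import math

# de Broglie wavelength lambda = h/p: an electron accelerated through 100 V
# versus a hypothetical 1 microgram grain at 1 mm/s (illustrative numbers).
h, me, e = 6.626e-34, 9.109e-31, 1.602e-19
p_electron = math.sqrt(2 * me * e * 100.0)  # non-relativistic momentum at 100 eV
lam_electron = h / p_electron               # ~1.2e-10 m, i.e. about 1.2 Å
lam_grain = h / (1e-9 * 1e-3)               # p = m*v = 1e-9 kg * 1e-3 m/s
print(lam_electron, lam_grain)              # the grain's lambda is hopelessly small
```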
Figure 11: Davisson-Germer experiment.
Several instruments exploit electron beams precisely because of their wave nature. For
instance, a scanning electron microscope (SEM) achieves sub-nanometer resolution
precisely because one can get electron beams with sub-Å de Broglie wavelength, so the
diffraction limit permits sub-Å resolution. In fact, the eventual resolution limit in an SEM
comes from other considerations, including the energy spread and the inter-electron
interactions in the electron beam. The transmission electron microscope (TEM) is another
important tool that utilizes the wave nature of electrons to image solids with atomic
resolution. Two important surface probes are low-energy electron diffraction (LEED) and
reflection high-energy electron diffraction (RHEED), which are used to monitor and track
the atomic structure of the surface of thin films during film growth/deposition.
2.1 Heisenberg’s uncertainty principle
On the basis of wave-particle duality and the de Broglie hypothesis, Heisenberg proposed
that the uncertainties in a given particle's position and momentum are inversely related;
more specifically, ∆p·∆x ≥ ℏ/2. We start this discussion by considering a wave
y(x, t) = A sin(kx − ωt) with k = 2π/λ and ω = 2πν. As defined, this wave exists
over the whole space and at all times. To get a space-localized wave or disturbance
one has to superpose waves of different λ values. These interfere to give a
rather localized entity that can describe a particle-like object. To get a feel for this, let's
superpose two waves of slightly different k. So we have y1(x, t) = A sin(k1x − ω1t) and
y2(x, t) = A sin(k2x − ω2t), and y(x, t) = y1 + y2. On simplifying,
y(x, t) = 2A sin[(k1 + k2)x/2 − (ω1 + ω2)t/2] cos[(k1 − k2)x/2 − (ω1 − ω2)t/2].
At a fixed time, say t = 0, this shows a beat-like pattern if plotted as a function of
x, as shown in the figure below. The width of each wave-group is ∆x = 2π/(k1 − k2).
The wave is made up of two different waves with k-values k1 and k2, and thus the spread
in k is ∆k ∼ |k1 − k2|. Using de Broglie's hypothesis, the spread in momentum is
∆p = ℏ|k1 − k2|. Thus the product of the spreads (or uncertainties) in x and p is
∆p·∆x = [2π/(k1 − k2)] × ℏ|k1 − k2| = 2πℏ = h.
Figure 13: The top and middle panels show two waves with slightly different wavelengths,
and the lower panel shows their superposition exhibiting a beats-like pattern in real space.
In the above illustration the beats pattern still fills the whole space; in order to
really get a space-confined wave one needs to superpose more k-waves. This is better
illustrated using Fourier transforms to superpose many waves over a range of
k. For example, if we superpose complex waves exp(ikx) with different k, such
that the amplitude of the k-component is g(k), we get a real-space wave y(x) =
∫ g(k) exp(ikx) dk. This is actually the inverse Fourier transform of g(k). For instance,
if we choose g(k) = exp(−|k|/k0), which in k-space has a width ∆k ∼ k0, we get y(x) ∼
[x² + k0⁻²]⁻¹. This is a Lorentzian of width ∆x ∼ k0⁻¹. Thus we get the uncertainty product
∆p·∆x ∼ ℏ. Thus we see the connection of the uncertainty principle with the spreads of
a complex function in x-space and p-space. This seems natural once we associate p with
λ, as one needs many different λ- or k-waves in order to create a localized wave that can
describe a particle.
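The g(k) = exp(−|k|/k0) example can be verified by direct numerical superposition. A Python sketch (k0 = 2 and the integration grid are assumed illustrative choices):

```python
import math

# Direct numerical superposition: y(x) = integral of g(k) e^{ikx} dk with
# g(k) = exp(-|k|/k0), compared to the exact Lorentzian (2/k0)/(x^2 + 1/k0^2).
k0 = 2.0

def y(x, kmax=60.0, n=20000):
    # Riemann sum over k; the result is real because g(k) is even in k
    dk = 2 * kmax / n
    return sum(math.exp(-abs(-kmax + i * dk) / k0) * math.cos((-kmax + i * dk) * x)
               for i in range(n)) * dk

def lorentzian(x):
    return (2.0 / k0) / (x**2 + 1.0 / k0**2)  # exact Fourier transform

for x in (0.0, 0.5, 2.0):
    print(y(x), lorentzian(x))  # the pairs agree closely
```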
Figure 14: The left schematic shows the diffraction of a plane wave when one tries to
constrain it using a slit. The right figure shows the objective lens of an optical microscope
which focuses the photons scattered from a particle in order to measure the particle’s
position.
The transverse momentum kick received by the particle from a detected photon can be as
large as ∆px = (h/λ) sin θ. This means that the photons we detect cause this much
uncertainty in the particle's momentum. The lens has a diffraction-limited resolution of
λ/sin θ, so the particle is located with a minimum position uncertainty ∆x = λ/sin θ.
Thus we get the uncertainty product ∆px·∆x = h. This means that an effort to measure
the position of a particle with precision ∆x leads to an uncertainty in its momentum of
order h/∆x. This thought experiment makes the result appear as an experimental
limitation; however, the analysis points towards an inherent limitation. Measurements
complicate, and perhaps overshadow, this inherent nature, which actually arises from the
wave nature of particles.
If for a physical system any natural dynamical variable which has the dimensions of
action assumes a numerical value comparable to Planck’s constant h then the behavior of
the system must be described by quantum mechanics. On the other hand if every variable
with action dimensions is very large in comparison to h, then the laws of classical physics
are valid with good accuracy.
Figure 15: The probability distribution of a particle-wave in x and p space, together with
the potential and kinetic energies in the two spaces. A finite spread of probability
away from the origin leads to finite kinetic and potential energies in the minimum energy
state, which is also called the ground state.
We begin with the total energy expression E = (mω²x²/2) + (p²/2m). The particle
has to have non-zero ⟨p²⟩ and ⟨x²⟩ in order to have a finite ∆x·∆p. What all this means
is that the particle in its ground state will not be localized at one point in either x or p.
So the particle will have a probability to take various x or p values, leading to a finite
spread. This aspect is driven by the uncertainty principle, and the underlying physics, as
we shall see later, is basically the wave nature of the particle. From symmetry and for
minimum energy we can assert that ⟨p⟩ = 0 and ⟨x⟩ = 0. Remember, ⟨p²⟩ = ∆p² + ⟨p⟩²
assumes a minimum value (for minimum KE) when ⟨p⟩² is minimum, i.e. zero. The
same logic applied to the potential energy term gives ⟨x⟩ = 0 for minimum potential energy.
Thus we get ⟨p²⟩ = ∆p² and ⟨x²⟩ = ∆x². We minimize the total energy keeping the
minimum uncertainty product, i.e. ∆x·∆p = ℏ/2 or ∆p = ℏ/2∆x. These arguments lead to
E = ½mω²∆x² + ℏ²/(8m∆x²).
Minimizing with respect to ∆x, i.e. ∂E/∂∆x = 0, leads to ∆x² = ℏ/2mω, and eventually
we get Emin = ℏω/2. This estimate of the minimum turns out to be exact for the SHO, as
we shall see later from the detailed solution.
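The minimization can be checked by a simple scan. A Python sketch in assumed units ℏ = m = ω = 1:

```python
# Scan E(dx) = m w^2 dx^2/2 + hbar^2/(8 m dx^2) to check E_min = hbar*w/2,
# in assumed units hbar = m = w = 1.
hbar = m = w = 1.0

def E(dx):
    return 0.5 * m * w**2 * dx**2 + hbar**2 / (8 * m * dx**2)

Emin = min(E(0.01 * i) for i in range(1, 500))
print(Emin)  # ~0.5, i.e. hbar*w/2, attained near dx^2 = hbar/(2*m*w)
```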
2.6 Matter waves and postulates of quantum mechanics
If we accept the wave description for matter particles, the natural questions to ask are
what quantity or variable this wave represents and what equation controls the dynamics
of this variable. Moreover, what is the prescription for deducing the experimentally
measurable quantities from this variable? For an em-wave we know that E⃗ and B⃗ describe
it, in the classical sense at least, and Maxwell's equations describe the dynamics of the E⃗
and B⃗ fields. Guided by em-waves, here are the starting postulates of quantum mechanics.
This part is more or less taken from the Cohen-Tannoudji book. Here I list these postulates.
1) The particle and wave aspects of light (or matter particles) are inseparable. Light
behaves simultaneously like a wave and like a flux of particles. The wave enables us to
calculate the probability of manifestation of a particle.
2) Predictions about the behavior of a photon (or matter particle) can only be probabilistic.
3) The information about the photon at time t is given by the wave E(⃗r, t), which is a
solution of Maxwell's equations. We say that this wave characterizes the state of the
photons at time t. E(⃗r, t) is the amplitude for a photon to be present at time t and position
⃗r; the corresponding probability is proportional to |E(⃗r, t)|².
4) This analogy between light and matter is a good background for introducing the wave-
function in analogy with E(⃗r, t). Particles behave like waves with λ = h/p and ν = E/h,
as suggested by de Broglie. This also gives p = ℏk with the wave vector k = 2π/λ, and
E = ℏω.
5) The quantum state of a particle is characterized by a wave-function ψ(⃗r, t) which
contains all the information it is possible to obtain about the particle.
6) ψ(⃗r, t) is the probability amplitude of the particle's presence. So the probability of
finding the particle in a small volume dτ at position ⃗r and time t is given by dP(⃗r, t) =
|ψ(⃗r, t)|²dτ, and so |ψ(⃗r, t)|² is the probability density. Since the probability of finding the
particle somewhere should be one, this means ∫|ψ(⃗r, t)|²dτ = 1. We can normalize a
wave-function by multiplying it by a constant C such that ∫|Cψ(⃗r, t)|²dτ = 1. For this it
is required that the wave-function be square integrable, i.e. ∫|ψ(⃗r, t)|²dτ should be finite.
7) With this description, and as compared to classical mechanics, the particle is now
described by an infinite number of variables, i.e. ψ(⃗r, t) is a continuous function, while
classically we only need ⃗v and ⃗r at time t.
8) This is analogous to light and E(⃗r, t) in the sense that |E|² represents the photon's
probability density. Also, just like E, ψ follows the superposition principle, i.e. ψ1 + ψ2
describes the interference of two wave-functions of the same particle, with |ψ1 + ψ2|²
describing the resulting probability density.
9) ψ(⃗r, t) allows one to find the probabilities of various outcomes. The experimental
verification must then be founded on the repetition of a large number of identical experiments.
10) ψ(⃗r, t) is intrinsically complex in quantum physics, unlike E for em-waves, where the
complex representation merely helps in solving the wave equation systematically. Actually,
the precise definition of the complex quantum state of radiation can only be given in the
framework of quantum electrodynamics, which is beyond the scope of this course.
11) Measurements are very important in quantum mechanics as they seem to interfere
with the natural state of the system. So let’s discuss them more systematically.
12) A given measurement yields results that belong to a certain set of characteristic
results called eigen-values, i.e. {a}. This set can be discrete or continuous depending on
the system and the type of measurement.
13) With each eigen-value there is an associated eigen-state wave-function ψa(⃗r). So
if a measurement yields a value ‘a0’, the wave-function afterwards will be ψa0(⃗r). Any
further measurements, without any time evolution, will always give the same value ‘a0’.
It is always possible to write a general wave-function as a linear combination of eigen-
functions, i.e. ψ(⃗r, t0) = Σa Ca ψa(⃗r). Then the probability of a measurement giving the
result ‘a0’ is Pa0 = |Ca0|²/Σa|Ca|².
Illustration: The idea of measurement can be illustrated using polarizers in the path
of a single photon. Consider two polarizers with their easy axes at an angle θ with respect
to each other. They are kept with their planes perpendicular to the light beam, as shown
in the figure.
Figure 16: Photon transmission through two polarizers with easy axis at different angles.
See text for details.
The first polarizer polarizes the light beam along the x-direction, so if the original beam was
unpolarized, half of the intensity is lost. After the second polarizer the intensity is further
reduced by a factor cos²θ; the remainder, proportional to sin²θ, is absorbed by the polarizer.
Now we reduce the intensity so much that only one photon crosses the polarizers at a
time. So after the photon crosses the first polarizer we know that its polarization is along
the x-axis. Now what happens when it crosses the second one? The probability that it
would cross is cos²θ and the probability of its absorption is sin²θ. But if we work with one photon only then only one thing can happen, i.e. absorption or transmission. Now
if it gets transmitted it’ll have a polarization êp , i.e. if we put more polarizers after the
second one with their axes along êp , the photon will get transmitted through all of them
with 100% probability.
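A simple Monte Carlo sketch (with an illustrative angle θ = 60°) reproduces the one-photon statistics: each photon is either transmitted or absorbed, yet the transmitted fraction approaches cos²θ:

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.pi / 3              # angle between the two easy axes (illustrative)
n_photons = 100_000

# Each x-polarized photon is transmitted with probability cos^2(theta)
# and absorbed with probability sin^2(theta): one outcome per photon.
transmitted = rng.random(n_photons) < np.cos(theta)**2
print(transmitted.mean())      # close to cos^2(60 deg) = 0.25
```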
Thus the result of a measurement is either one of the eigen-values of the measure-
ment or the state gets annihilated completely in this particular case. So the measurement
disturbs the system in a fundamental way. If the result is certain eigen-value then the
resulting state is going to be the corresponding eigen-state after the measurement. By
knowing the original state, one can predict the probability of various eigen-values which
can then be verified by doing the experiment on several such identical particles or systems. In this context the Stern-Gerlach experiment is also relevant, which is discussed in the Feynman Lectures on Physics, vol. 3.
14) The equation describing the time evolution of ψ(⃗r, t) is the Schrödinger equation
given by,
iℏ ∂ψ/∂t = −(ℏ²/2m) ∇²ψ + V(⃗r)ψ(⃗r, t). (18)
We shall not attempt to prove it. It is one of the fundamental equations that could be
derived by higher level physics. However, historically it was just proposed based on the
de Broglie’s wave ideas and verified experimentally later on.
This equation has the following properties:
(A) It is linear and homogeneous in ψ, which means that the superposition principle
is valid, i.e. if ψ1 (x, t) and ψ2 (x, t) are two solutions of this equation than ψ(x, t) =
C1 ψ1 (x, t) + C2 ψ2 (x, t) is also a solution. Here C1 and C2 are two complex numbers.
(B) It is a first order differential equation in time. This is necessary if the state of the particle at t0 (alone) is to determine its state later. If it were a second order equation in time then one would need both ψ and ∂ψ/∂t|_{t0} in order to find ψ(t).
(C) We also recall that for ψ(⃗r, t) to be a valid wave-function, it has to be square-
integrable.
If we compare with the usual wave-equation which is second order in time, here we have
genuinely complex solutions. But the similarities with the wave-equation are quite striking
and the intuition derived from wave phenomena is often correct in quantum mechanics but
not always. For instance the Huygens construction is not valid in quantum mechanics.
For a free particle in 1-D, i.e. V(x) = 0, the TDSE becomes

iℏ ∂ψ/∂t = −(ℏ²/2m) ∂²ψ/∂x². (19)
In order to solve this equation we use separation of variables, which works as the space and time derivative terms above are decoupled. We write a possible solution as ψ(x, t) = ψ1(x)χ(t), leading to

(iℏ/χ) dχ/dt = −(ℏ²/2mψ1) d²ψ1/dx² = E. (20)
Here E is a constant which is actually the energy, as it relates to the frequency of wave-function evolution. The last equality follows as the first term depends only on t and the second only on x, so both must be equal to the same constant. This leads to two equations, namely (iℏ/χ) dχ/dt = E and −(ℏ²/2mψ1) d²ψ1/dx² = E. The first one has the solution χ(t) = C1 exp(−iEt/ℏ), while the second one has the solution ψ1(x) = A1 e^{ikx} + B1 e^{−ikx} with k² = 2mE/ℏ². Here, E must be a positive number in order to have the expected wave-like solution for a free particle. Thus we see how E gets related to frequency and wave-vector consistent with the de Broglie hypothesis, i.e. ω = E/ℏ and E = ℏ²k²/2m. We can
eliminate E to get the dispersion relation ω = ℏk²/2m, which is different from the em-wave dispersion, i.e. ω = ck. The non-linear dispersion implies that matter waves are dispersive. Moreover, for matter waves, the phase velocity is vp = ω/k = ℏk/2m while the group velocity is vg = dω/dk = dE/dp = ℏk/m = p/m, i.e. p = mvg.
For a given E the most general solution is

ψ(x, t) = [A e^{ikx} + B e^{−ikx}] e^{−iEt/ℏ},

with A and B as two independent constants. The first term represents a wave propagating in the +x direction while the second one is a wave moving along the −x direction. In order to proceed with a general solution for the free particle let's first discuss a special function called the Dirac delta function.
1) It vanishes everywhere except at the singular point, i.e. δ(x − x0) = 0 for x ≠ x0.
2) Its integral over a range including the singular point is one, i.e.

∫_{x0−ϵ}^{x0+ϵ} δ(x − x0) dx = 1. (23)
Here ϵ can be an arbitrarily small number. Thus, a δ-function has dimensions of the inverse of its argument, i.e. in the above case (Length)⁻¹. This is a rather singular object and it's better to look at this function as the limit of a sequence of functions. I use two such constructions to illustrate this.
1) We take a box-like function of height 1/2a and width 2a centered at the origin, i.e.

f(x) = (1/2a) Θ(a − |x|) = 1/2a for |x| < a, and 0 for |x| > a. (24)

Here Θ is the step-function. Thus in the limit a → 0 this function assumes a very narrow width and a very large value at the origin, see figure. At the same time its integral over a range including the origin is one. Thus we can assert δ(x) = lim_{a→0} f(x).
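This limiting construction is easy to check numerically: as a shrinks, the peak height 1/2a diverges while the integral stays pinned at one (grid sizes here are illustrative):

```python
import numpy as np

def box(x, a):
    """Box of height 1/(2a) and width 2a centred at the origin."""
    return np.where(np.abs(x) < a, 1.0 / (2 * a), 0.0)

x, dx = np.linspace(-1, 1, 2_000_001, retstep=True)
for a in (0.1, 0.01, 0.001):
    integral = box(x, a).sum() * dx        # stays ~1 as a -> 0
    print(a, integral, 1 / (2 * a))        # while the peak height diverges
```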
2) We take a limit of the sinc-function. This is rather useful in the context of the Fourier transform involved in the above free-particle wave-function discussion. We directly define the δ-function in k-space as,

g(k) = sin(kL)/πk. (25)
This function is oscillatory, though not periodic, with period ∆k = 2π/L. Its magnitude away from the origin is bounded by 1/π|k| and its value at the origin is L/π. Here we take the limit L → ∞. In this limit, the oscillation period approaches zero and its value at the origin diverges. So its average value in a small k-range, away from the origin, will be zero. Thus it acquires a δ-function like appearance. To prove the second property we need ∫g(k)dk, which we evaluate over the full k-range with the understanding that the integral will get zero contribution from regions away from the origin, as the integrand oscillates there with very small period. Moreover, we'll have to use the residue method to evaluate this integral.
We consider the following integral,

∫_{−∞}^{∞} (e^{iz}/z) dz = (1/2) × 2πi Res_{z=0}[e^{iz}/z] = iπ. (26)

The last step uses the fact that exp(iz)/z only has a pole at the origin, with residue 1, and that the contour picks up half of it. Finally, using the fact ∫_{−∞}^{∞} (sin x/x) dx = Im[∫_{−∞}^{∞} (e^{iz}/z) dz], we get ∫_{−∞}^{∞} (sin x/x) dx = π and hence ∫_{−∞}^{∞} g(k) dk = 1.
A few other useful properties of the δ-function are:
1) (d/dx) Θ(x − x0) = δ(x − x0), or ∫_{−∞}^{x} δ(x′ − x0) dx′ = Θ(x − x0),
2) δ(−x) = δ(x), i.e. it's a symmetric function,
3) ∫ g(x) δ(x − x0) dx = g(x0), if the integration range encompasses x0, and zero otherwise,
4) δ(f(x)) = Σ_{xn} (1/|f′(xn)|) δ(x − xn).
Here, the sum runs over the xn which are the zeroes of f(x), i.e. f(xn) = 0. The last one also leads to δ(αx) = (1/|α|)δ(x).
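The sifting and scaling properties can be checked by replacing δ(x) with a narrow normalized Gaussian (a standard nascent-delta trick; widths and grids below are illustrative):

```python
import numpy as np

def delta_eps(x, eps=1e-3):
    """Narrow normalized Gaussian standing in for delta(x)."""
    return np.exp(-x**2 / (2 * eps**2)) / (eps * np.sqrt(2 * np.pi))

x, dx = np.linspace(-2, 2, 400_001, retstep=True)

# sifting property: integral of g(x) delta(x - x0) -> g(x0)
x0 = 0.5
sift = (np.cos(x) * delta_eps(x - x0)).sum() * dx
print(sift, np.cos(x0))                  # both ~ cos(0.5)

# scaling property: delta(alpha*x) integrates to 1/|alpha|
alpha = 3.0
scale = delta_eps(alpha * x).sum() * dx
print(scale, 1 / abs(alpha))             # both ~ 1/3
```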
The second construction of the δ-function discussed above is useful when one tries to evaluate ∫_{−∞}^{∞} e^{ikx} dx. We have ∫_{−L}^{L} e^{ikx} dx = (1/ik)(e^{ikL} − e^{−ikL}) = 2 sin(kL)/k. So for L → ∞, we get

∫_{−∞}^{∞} e^{ikx} dx = 2π lim_{L→∞} [sin(kL)/πk] = 2πδ(k). (27)
Here, Ap is a complex number which depends on p and the above sum is carried out over all (positive and negative) values of p. In fact, for infinite space in 1-D, p can take continuous values and thus a general wave-function is more appropriately written as,

ψ(x, t) = A1 ∫_{−∞}^{∞} g(p) exp[i(px/ℏ − p²t/2mℏ)] dp. (28)
Here A1 is a normalization constant and g(p) is a complex function of p. At t = 0, we have

ψ(x, 0) = A1 ∫_{−∞}^{∞} g(p) exp(ipx/ℏ) dp, (29)

leading to

g(p) = [1/(2πℏ A1)] ∫_{−∞}^{∞} ψ(x, 0) e^{−ipx/ℏ} dx. (30)
The above is the defining relation for Fourier transform. Our objective is to find A1 so
that both ψ(x, 0) and g(p) are normalized. For this,
∫_{−∞}^{∞} |g(p)|² dp = ∫_{−∞}^{∞} g(p) g*(p) dp
= [1/(2πℏ|A1|)²] ∫_{−∞}^{∞} dp [∫_{−∞}^{∞} ψ(x, 0) e^{−ipx/ℏ} dx] [∫_{−∞}^{∞} ψ*(x′, 0) e^{ipx′/ℏ} dx′]
= [1/(2πℏ|A1|)²] ∫_{−∞}^{∞} ∫_{−∞}^{∞} ψ(x, 0) ψ*(x′, 0) [∫_{−∞}^{∞} exp[ip(x′ − x)/ℏ] dp] dx dx′
= [1/(2πℏ|A1|)²] ∫_{−∞}^{∞} ∫_{−∞}^{∞} ψ(x, 0) ψ*(x′, 0) 2πℏ δ(x′ − x) dx′ dx
= [1/(2πℏ|A1|)²] ∫_{−∞}^{∞} |ψ(x, 0)|² 2πℏ dx
= [1/(2πℏ|A1|²)] ∫_{−∞}^{∞} |ψ(x, 0)|² dx.

Now for both g(p) and ψ(x, 0) to be normalized we should have |A1|² = 1/2πℏ. Thus we write the full expressions for the Fourier transform and its inverse, respectively, as
g(p) = (1/√(2πℏ)) ∫_{−∞}^{∞} ψ(x, 0) e^{−ipx/ℏ} dx (31)

and

ψ(x, 0) = (1/√(2πℏ)) ∫_{−∞}^{∞} g(p) e^{ipx/ℏ} dp. (32)
Please note that the Inverse Fourier Transform (IFT) of a Fourier Transform (FT) of a given function leads back to the original function. In fact the FT (or IFT) is a linear transformation, implying many simplifications such as FT[c1ψ1(x) + c2ψ2(x)] = c1FT[ψ1(x)] + c2FT[ψ2(x)]. Moreover, as defined above with the 1/√(2πℏ) pre-factor, it preserves the norm of the function. It turns out that it is a unitary transformation, which will become clearer later when we discuss linear vector spaces.
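The norm-preserving (unitary) character can be seen numerically with a discretized state; a sketch using the FFT as the grid version of the transform (grid parameters are illustrative):

```python
import numpy as np

N, L = 1024, 40.0
x = np.linspace(-L/2, L/2, N, endpoint=False)
dx = L / N

psi = np.exp(-x**2) * np.exp(2j * x)            # an arbitrary normalizable state
psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)     # normalize in x-space

phi = np.fft.fft(psi) * dx / np.sqrt(2 * np.pi) # grid Fourier transform
dk = 2 * np.pi / L
norm_k = np.sum(np.abs(phi)**2) * dk
print(norm_k)                                   # -> 1.0: the norm is preserved
```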
2) The wave equation has general solutions of the form f(x ± vt), which is not possible for the Schrödinger equation.
3) Wave equation, being second order in time, requires two initial conditions to find
y(x, t). These can be y(x, 0) and ẏ(x, 0).
4) Boundary conditions with respect to x, for time independent equations, often have
similar nature. For example, the solution vanishes for both the particle in a well and the
electric field in a cavity at the boundaries. This leads to similar solutions for both cases.
5) The wave equation also has plane wave solutions of the form e^{i(kx±ωt)} with ω real-positive and k taking any real value. This follows from separation of variables. Since the wave-equation is a linear homogeneous equation, a general solution can be written as y(x, t) = ∫_{−∞}^{∞} A−(k) e^{i(kx−ωt)} dk + ∫_{−∞}^{∞} A+(k) e^{i(kx+ωt)} dk with ω = vk. The initial conditions lead to y(x, 0) = ∫_{−∞}^{∞} [A+(k) + A−(k)] e^{ikx} dk and ẏ(x, 0) = iω ∫_{−∞}^{∞} [A+(k) − A−(k)] e^{ikx} dk. Suppose Y(k) and Z(k) are the Fourier transforms of y(x, 0) and ẏ(x, 0), respectively; these can be used to get A+(k) = [Y(k) + Z(k)/iω]/2 and A−(k) = [Y(k) − Z(k)/iω]/2, which can now be used to find y(x, t).
6) In wave-equation solutions y is always a real number and the complex notation is only for mathematical convenience. Eventually one takes the real or imaginary part of the complex mathematical solution as the valid physical solution. This is in contrast with the Schrödinger equation.
3.4 Free-particle Gaussian wave-packet
Here we construct a localized wave-packet solution of the TDSE to simulate how a particle-like entity would evolve with time according to the TDSE. We discuss a specific wave-packet called the Gaussian wave-packet, given by
g(p) = A exp[−(a²/4ℏ²)(p − p0)²]. (35)
Here a is a constant with dimension of length. A is to be found from the normalization
condition,
∫_{−∞}^{∞} |g(p)|² dp = |A|² ∫_{−∞}^{∞} exp[−(a²/2ℏ²)(p − p0)²] dp = 1. (36)
We use the integral

∫_{−∞}^{∞} exp[−α(ξ + β)²] dξ = √(π/α). (37)

This result is valid for complex α and β, if Re[α] is positive so that the integral converges. We can also differentiate this integral wrt α to derive, for β = 0,

∫_{−∞}^{∞} ξ² exp[−αξ²] dξ = √π/2α^{3/2}. (38)
This helps us evaluate the integral in eq. 36 to get |A| = (2π)^{−1/4} √(a/ℏ). Thus we get,

g(p) = [1/(2π)^{1/4}] √(a/ℏ) exp[−(a²/4ℏ²)(p − p0)²]. (39)
Now to work out the wave-function in x-space we need to find its inverse Fourier transform using eq. 32. This is a bit cumbersome but straightforward after we use the Gaussian integration identities. Here are the main steps,

ψ(x, 0) = (a²/8π³ℏ⁴)^{1/4} ∫_{−∞}^{∞} exp[−(a²/4ℏ²)(p − p0)² + ipx/ℏ] dp
= (a²/8π³ℏ⁴)^{1/4} ∫_{−∞}^{∞} exp[−(a²/4ℏ²)(p − p0 − 2iℏx/a²)²] exp[−x²/a² + ip0x/ℏ] dp
= (2/πa²)^{1/4} exp[−x²/a² + ip0x/ℏ]. (40)
Remember this is the wave-function at time t = 0 and we are yet to work out the wave-function at non-zero times. It is clear that this wave-packet is localized in both x and p spaces. In p-space, g(p) is real and symmetric about p0, while in x-space the wave-function is complex and its real and imaginary parts are oscillatory with wave-vector p0/ℏ, both having a Gaussian envelope centered at x = 0. Thus the wave-function is spread over a finite x and p range leading to uncertainty in both.
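One can verify eq. 40 by carrying out the inverse transform of eq. 39 numerically; a sketch in natural units (ℏ = a = 1, p0 = 5 are illustrative choices):

```python
import numpy as np

hbar, a, p0 = 1.0, 1.0, 5.0     # natural units, illustrative parameters

p, dp = np.linspace(p0 - 40, p0 + 40, 200_001, retstep=True)
g = np.sqrt(a / hbar) / (2 * np.pi)**0.25 * np.exp(-a**2 * (p - p0)**2 / (4 * hbar**2))

def psi_numeric(x):
    # inverse Fourier transform, eq. (32), by direct summation
    return (g * np.exp(1j * p * x / hbar)).sum() * dp / np.sqrt(2 * np.pi * hbar)

def psi_exact(x):
    # closed form, eq. (40)
    return (2 / (np.pi * a**2))**0.25 * np.exp(-x**2 / a**2 + 1j * p0 * x / hbar)

print(psi_numeric(0.3), psi_exact(0.3))   # the two agree to high accuracy
```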
3.5 Uncertainty product for Gaussian wave-packet
Let us evaluate these uncertainties explicitly. For ∆x² = ⟨x²⟩ − ⟨x⟩², we have ⟨x⟩ = ∫_{−∞}^{∞} x|ψ(x, 0)|² dx = √(2/πa²) ∫_{−∞}^{∞} x exp(−2x²/a²) dx = 0. The integral here gives zero as the integrand is an odd function of x.
Now,

⟨x²⟩ = ∫_{−∞}^{∞} x²|ψ(x, 0)|² dx = √(2/πa²) ∫_{−∞}^{∞} x² exp(−2x²/a²) dx = √(2/πa²) × √π/[2(2/a²)^{3/2}] = a²/4.

Thus we get ∆x = √(⟨x²⟩ − ⟨x⟩²) = a/2.
Now we go after ∆p² = ⟨p²⟩ − ⟨p⟩². For this,

⟨p⟩ = ∫_{−∞}^{∞} p|g(p)|² dp = [a/√(2π)ℏ] ∫_{−∞}^{∞} p exp[−(a²/2ℏ²)(p − p0)²] dp.

Writing p = (p − p0) + p0, the first term vanishes by symmetry and the second uses the Gaussian integral, eq. 37, to obtain

⟨p⟩ = [a/√(2π)ℏ] p0 (2πℏ²/a²)^{1/2} = p0.

Similarly, use of eqs. 37 and 38 gives ⟨p²⟩ = p0² + (ℏ²/a²). Thus we get ∆p² = ⟨p²⟩ − ⟨p⟩² = p0² + (ℏ²/a²) − p0² = ℏ²/a². Hence, for this Gaussian wave-packet, the uncertainty product is ∆p·∆x = (ℏ/a)·(a/2) = ℏ/2. This is the minimum possible uncertainty product. In fact, one can prove rigorously that a Gaussian is the only wave-packet shape that leads to the minimum uncertainty product; as we shall see, even this Gaussian packet does not retain the minimum product at later times.
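A numerical check of the minimum product, using the density |ψ(x, 0)|² from eq. 40 (units ℏ = 1 and a = 2 are illustrative):

```python
import numpy as np

hbar, a = 1.0, 2.0                 # natural units, illustrative width

x, dx = np.linspace(-20, 20, 400_001, retstep=True)
rho = np.sqrt(2 / (np.pi * a**2)) * np.exp(-2 * x**2 / a**2)   # |psi(x,0)|^2

mean_x  = (x * rho).sum() * dx
mean_x2 = (x**2 * rho).sum() * dx
dx_unc = np.sqrt(mean_x2 - mean_x**2)    # -> a/2
dp_unc = hbar / a                        # from the p-space Gaussian

print(dx_unc, dx_unc * dp_unc)           # a/2 = 1.0 and hbar/2 = 0.5
```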
3.6 Time evolution of the Gaussian wave-packet
We follow the general strategy of solving TDSE for free particle for given initial ψ(x, 0).
In fact we started with a g(p) and worked out the ψ(x, 0). Now, according to earlier
discussion as per eq. 28, the corresponding time dependent wave-function is given by,
ψ(x, t) = (1/√(2πℏ)) ∫_{−∞}^{∞} g(p) exp[i(px/ℏ − p²t/2mℏ)] dp. (41)

Here, g(p) = A exp[−(a²/4ℏ²)(p − p0)²] with A = [1/(2π)^{1/4}] √(a/ℏ). The mathematics of the evaluation of the above integral is rather tedious though doable analytically. It involves a few integrals that use eqs. 37 and 38 with complex α and β. You can work it out if you feel up to it. I'll give the final result here,
ψ(x, t) = (2/π)^{1/4} √[a/(a² + 2iℏt/m)] exp(−a²p0²/4ℏ²) exp[−(x − ia²p0/2ℏ)²/(a² + 2iℏt/m)]. (42)
This looks rather complicated. We can work out the time dependence of the probability distribution in real space, which gives,

P(x, t) = ψ(x, t)ψ*(x, t) = √(2/πa(t)²) exp[−2(x − p0t/m)²/a(t)²]. (43)

Here,

a(t)² = a² + 4ℏ²t²/m²a². (44)

We notice that |g(p)|², i.e. the probability distribution in p-space, does not change while P(x, t) changes with time. This also means that the uncertainty ∆p = ℏ/a remains the same, while the uncertainty ∆x = a(t)/2 is time dependent. Thus the product,

∆x·∆p = (ℏ/2) [1 + 4ℏ²t²/m²a⁴]^{1/2} ≥ ℏ/2. (45)
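Eqs. 44 and 45 are easy to tabulate; a sketch in natural units (ℏ = m = a = 1, an assumption):

```python
import numpy as np

hbar, m, a = 1.0, 1.0, 1.0     # natural units (assumption)

def width(t):
    """a(t) from eq. (44); the x-uncertainty is a(t)/2."""
    return np.sqrt(a**2 + 4 * hbar**2 * t**2 / (m**2 * a**2))

for t in (0.0, 1.0, 10.0):
    product = (width(t) / 2) * (hbar / a)   # Delta x . Delta p
    print(t, product)                       # hbar/2 at t = 0, grows afterwards
```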
4) One can also make out how the small-λ plane-waves move faster than the large-λ ones, leading to dispersion or spread in the wave-packet.
5) One can understand the increase in spread easily by making a parallel with an athletic race. In a race all runners start from the same start line, but after the race starts their positions have a spread as the runners run with different speeds, or in other words there is a spread in velocity. One can easily argue that the real-space spread in the runners' positions, i.e. ∆x, will be given by ∆x = ∆v·t. This is the case for the Gaussian wave-packet at large times, where ∆x ≈ (∆p/m)t with ∆p = ℏ/a.
6) We can estimate this rate of spread in x-space for macroscopic objects. For m = 1 mg and a = 1 µm we get d∆x/dt = ∆p/m = ℏ/ma ∼ 10⁻²² m/s. This is extremely small as compared to the particle size and macroscopic velocity scales (m/s). It'll take about 10¹⁶ s, or 10⁹ years, to see a doubling of the size due to dispersion.
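A quick back-of-the-envelope check of these numbers (mass and size chosen at macroscopic scale for illustration):

```python
hbar = 1.05e-34     # J s
m = 1e-6            # kg (a milligram-scale particle)
a = 1e-6            # m  (micron-scale initial width)

rate = hbar / (m * a)          # large-time spreading rate d(Delta x)/dt
t_double = m * a**2 / hbar     # rough time for the width to grow by ~a
print(rate)                    # ~1e-22 m/s
print(t_double)                # ~1e16 s, i.e. ~1e9 years
```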
Similarity to diffusion: It turns out that the Schrödinger equation for a free particle, i.e. ∂²ψ/∂x² = −i(2m/ℏ) ∂ψ/∂t, has a similar structure as the diffusion equation, i.e. ∂²T/∂x² = (1/D) ∂T/∂t. Here T, for instance, can be temperature, with thermal diffusivity D = κ/ρc where κ is the thermal conductivity, c the specific heat and ρ the density. The diffusion equation applies to many phenomena such as diffusion of atoms or chemical species, charge, and heat. It has "diffusive" (similar to "dispersive") solutions where a localized distribution at t = 0 spreads as time progresses. In fact the mathematics of solving the diffusion equation is similar to that of the Schrödinger equation for a free particle. One difference is that in the diffusion case the spatial spread always increases as time progresses, while for matter waves it can either increase or decrease, as we saw above.
The last equality follows from the Schrödinger equation. So iℏ(∂/∂t) is equivalent to the energy operator.
Another entity that we introduce here is the matrix element of an operator. We'll discuss a more detailed approach to such entities later when we discuss linear vector spaces. However, we need these to develop the machinery of quantum mechanics further. We define the matrix element of an operator A between two states described by wave-functions ψ(x) and ϕ(x) as ⟨ψ|A|ϕ⟩ = ∫ψ*(x)Aϕ(x)dx. This is different from the matrix element ⟨ϕ|A|ψ⟩ = ∫ϕ*(x)Aψ(x)dx, as in the former case A operates on ϕ(x) while in the latter it operates on ψ(x). The expectation value defined above is a special matrix element, i.e. ⟨A⟩ψ = ⟨ψ|A|ψ⟩.
Another concept is that of Hermitian operators. Again this concept will be discussed in more detail later. It turns out that all eigen-values of a Hermitian operator are real, which is why in quantum mechanics any observable, such as position, momentum, or energy, is associated with a Hermitian operator. We need to define this Hermitian property mathematically. An operator A is called Hermitian if its matrix elements over any arbitrary wave-functions ϕ(x) and ψ(x) satisfy ⟨ψ|A|ϕ⟩ = ⟨ϕ|A|ψ⟩*, or in another way,

∫ψ*(x)Aϕ(x)dx = (∫ϕ*(x)Aψ(x)dx)* = ∫[Aψ(x)]*ϕ(x)dx. (46)
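In a discrete (matrix) setting this definition is easy to test; a sketch with a random Hermitian matrix standing in for A and random vectors for ψ and ϕ:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = M + M.conj().T                    # this combination is always Hermitian

# eigen-values of a Hermitian operator are real
evals = np.linalg.eigvals(A)
print(np.max(np.abs(evals.imag)))     # ~ 0

# matrix-element identity <psi|A|phi> = <phi|A|psi>*
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
phi = rng.normal(size=4) + 1j * rng.normal(size=4)
lhs = psi.conj() @ A @ phi
rhs = (phi.conj() @ A @ psi).conj()
print(np.isclose(lhs, rhs))           # True
```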
The eigen-state ϕE(x) is also called a stationary state as it does not evolve with time except for a phase factor. For a given initial state ψ1(x, 0) at t = 0, if one wants to find its time evolution one has to find the complex coefficients A_{Eα} such that ψ1(x, 0) = Σ A_{Eα} ϕ_{Eα}(x). One can then use these A_{Eα} in eq. 47 to write the time-dependent solution.
The fact that any arbitrary (physically acceptable) ψ1(x) can be written as a linear superposition of the ϕ_{Eα} for a general Hamiltonian is not trivial. But it turns out that these ϕ_{Eα} always form a complete set. These ϕ_{Eα} are called eigen (characteristic) functions.
The fixed energy states have special significance because one can observe transitions
between these states and the energy difference between these states is easily observed as
emitted (or absorbed) radiation at specific frequencies.
Some remarks on TISE:
1) The TISE has no 'i' so its solution need not be complex, but it can be if convenience dictates. For example, for a free particle we chose exp(±ikx), but we could have chosen sin kx and cos kx as well.
2) An eigen-state wave-function ϕ(x) must be finite, single valued, continuous.
3) dϕ/dx must also be finite, single valued and continuous.
These properties of the TISE and ϕE(x) sometimes (for bound states) force the solutions of the TISE to exist only at certain energies. This leads to quantized energies for bound states, as we'll see soon. A good reference to see how this happens is sec. 5.7 of Eisberg and Resnick.
It should also be pointed out that some of the mathematical restrictions are marginally violated for ideal (unphysical) potentials that go to infinity or have a step jump. For instance, if the potential abruptly jumps to infinity at some point, as for the infinite well, dψ/dx can be discontinuous.
Hψ = −(ℏ²/2m) ∂²ψ/∂x² + V(x)ψ(x) = iℏ ∂ψ/∂t.
The probability density is ρ(x, t) = |ψ(x, t)|², assuming the wave-function ψ(x, t) to be normalized. As a result of the time evolution of ψ the probability density will change with time. However, when the probability in a certain given region changes we expect to see a flow of this probability at the boundaries. This is a local conservation of probability. It is similar to charge conservation in the sense that when the charge contained in a certain region of space changes with time we expect to see a charge current at the boundary of this region. This leads to the continuity equation: ∂ρ_Q/∂t + ∇·J⃗_Q = 0. We are after a similar concept here. We have,

∂ρ/∂t = ∂(ψψ*)/∂t = ψ* ∂ψ/∂t + ψ ∂ψ*/∂t.

Using the TDSE for the time derivatives this gives,

∂ρ/∂t = (1/iℏ) ψ*Hψ − (1/iℏ) ψHψ*.

Thus the probability for finding the particle between x1 and x2, i.e. P(x, t) = ∫_{x1}^{x2} ρ(x, t) dx, will change with time as

∂P/∂t = ∫_{x1}^{x2} (∂ρ/∂t) dx = (1/iℏ) ∫_{x1}^{x2} [ψ*(−(ℏ²/2m) ∂²ψ/∂x² + Vψ) − ψ(−(ℏ²/2m) ∂²ψ*/∂x² + Vψ*)] dx
= −(ℏ/2im) ∫_{x1}^{x2} [ψ* ∂²ψ/∂x² − ψ ∂²ψ*/∂x²] dx.
We use integration by parts to get

∂P/∂t = −(ℏ/2im) ∫_{x1}^{x2} (∂/∂x)[ψ* ∂ψ/∂x − ψ ∂ψ*/∂x] dx = −(ℏ/2im) [ψ* ∂ψ/∂x − ψ ∂ψ*/∂x]_{x1}^{x2} = −[S(x2) − S(x1)],

where S(x) = (ℏ/2im)[ψ* ∂ψ/∂x − ψ ∂ψ*/∂x]. Since ∂P/∂t = ∫_{x1}^{x2} (∂ρ/∂t) dx, in other words,

∂ρ/∂t + ∂S/∂x = 0 in 1-D,
∂ρ/∂t + ∇⃗·S⃗ = 0 in 3-D. (49)
In 3-D S⃗ is actually a vector quantity. It basically represents probability flux, i.e. probability flow per unit area per unit time, and is given by S⃗ = (ℏ/m) Im[ψ*∇⃗ψ]. Eq. 49 is the continuity equation describing local probability conservation, just like the continuity equation for charge conservation in em-theory.
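For a plane wave ψ = e^{ikx} the current works out to ℏk/m = p/m; a grid sketch in natural units (ℏ = m = 1, k = 2 are illustrative choices):

```python
import numpy as np

hbar, m, k = 1.0, 1.0, 2.0       # natural units, illustrative wave-vector
x, dx = np.linspace(0, 10, 10_001, retstep=True)
psi = np.exp(1j * k * x)         # plane wave (not normalizable; illustrative)

dpsi = np.gradient(psi, dx)                 # numerical d(psi)/dx
S = (hbar / m) * (psi.conj() * dpsi).imag   # S = (hbar/m) Im[psi* psi']
print(S[5000], hbar * k / m)                # current = hbar*k/m for |psi| = 1
```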
Figure 17: Illustration of time-frequency uncertainty in a harmonic oscillator. The damped motion in the time-domain in the left panel and the driven motion in the frequency domain are consistent with ∆t·∆ν ∼ 1.
evolves significantly over a time scale ℏ/|E1 − E2|, i.e. ℏ/∆E. For instance the probability of measuring the system in the two orthogonal states ϕE1(x) + ϕE2(x) and ϕE1(x) − ϕE2(x) works out to be cos²(∆Et/2ℏ) and sin²(∆Et/2ℏ), respectively. These have significant evolution over a time scale ∆t ∼ ℏ/∆E. This idea can also be mapped to a classical coupled harmonic oscillator system.
3) In nature when atoms emit em-radiation at a characteristic frequency, the emission line has a natural line-width ∆λ. The measured line-width can arise from extrinsic factors such as Doppler broadening and from the intrinsic lifetime τ of the excited states. This τ and ∆E = hc∆λ/λ² are related by ∆E·τ ∼ ℏ. Typical values of τ are in the 10 ns range, giving ∆λ/λ ∼ 10⁻⁷ for ν ∼ 10¹⁴ Hz.
4) Let’s try to measure the momentum of a free particle of mass m by observing its
displacement over time ∆t, i.e. ∆t = ∆x/v = m∆x/p assuming that the particle moves
by ∆x in ∆t. Now suppose the uncertainty in momentum measured in this way is ∆p
then ∆E = ∆(p2 /2m) = p∆p/m. Thus ∆E.∆t = (p∆p/m).(m∆x/p) = ∆p.∆x ≥ ℏ/2.
5) Suppose one tries to measure E by looking at the time evolution of the wave-function and finding the frequency with which it is evolving. This sounds difficult in QM, but in classical oscillators one does measure the frequency in this way. If one measures the behavior over a time ∆t then the accuracy (or uncertainty) in the frequency measurement will be ∼ 1/∆t, as dictated by the Fourier transform in the time domain. This also leads to ∆E·∆t ∼ ℏ.
6) Another interpretation relates to an explanation of quantum tunneling, where there is an apparent violation of energy conservation, at least for a short time, when a particle crosses the barrier through quantum tunneling. It seems that one can actually create or destroy an energy ∆E for a time period ∼ ℏ/∆E. The fact that a particle goes across a barrier seems to violate energy conservation. A reconciliation is that an energy ∆E can temporarily be borrowed for a time ∆t ≤ ℏ/∆E if this time is enough for going across. We should understand that this extra energy is temporary and tunneling is a perfectly elastic process.
4.5 Heisenberg equation of motion
In QM any given measurable quantity corresponds to an operator, say B, whose expectation value ⟨B⟩ = ∫ψ*(x, t)Bψ(x, t)dx is a useful and measurable quantity. Its time evolution is also important. It turns out that finding how this expectation value evolves does not necessarily require solving for the time-dependent wave-function. It can be obtained using the Heisenberg equation of motion, as we discuss here. ⟨B⟩ for a time-independent operator will change with time as a general state ψ(x, t) changes with time. As mentioned earlier, any measurable quantity will correspond to a Hermitian operator as defined by eq. 46. We have,
∂⟨B⟩/∂t = ∫ ∂(ψ*Bψ)/∂t dx = ∫ [(∂ψ*/∂t)Bψ + ψ*(∂B/∂t)ψ + ψ*B(∂ψ/∂t)] dx.

Using the TDSE for the first and third terms, we get

∂⟨B⟩/∂t = ⟨∂B/∂t⟩ + ∫ [−(1/iℏ)(Hψ)*Bψ + (1/iℏ)ψ*BHψ] dx.

Now since H is Hermitian we use eq. 46, i.e. ∫(Hψ)*ϕ dx = ∫ψ*Hϕ dx with ϕ = Bψ, to get ∫(Hψ)*Bψ dx = ∫ψ*HBψ dx. We also assume B to be time independent. Thus the above expression simplifies to

∂⟨B⟩/∂t = (1/iℏ) ∫ [ψ*BHψ − ψ*HBψ] dx = (1/iℏ) ∫ ψ*[BH − HB]ψ dx.
Defining the commutator [B, H] = BH − HB, we get the Heisenberg equation of motion:
∂⟨B⟩/∂t = (1/iℏ) ⟨[B, H]⟩. (50)
Here are a few consequences and remarks related to this equation:
1) When [B, H] = 0 we get ∂⟨B⟩/∂t = 0, i.e. ⟨B⟩ is a constant of motion. Thus the operators that commute with the Hamiltonian are important as they define constants of motion. Usually such an operator describes a certain symmetry of the system (or Hamiltonian). It often helps to find such operators first in order to solve for the eigen-states of H. Since H commutes with itself, the energy expectation value is a constant of motion.
2) We can also prove that the expectation value of any operator over an eigen-state
of H is independent of time.
3) Another consequence of a vanishing commutator is that one can always find functions which are simultaneous eigen-functions of the commuting operators. This, as we shall see, helps in accounting for the degenerate eigen-states of H.
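These statements can be tested with finite-difference matrices for x, p and H; a sketch (natural units ℏ = m = 1; the harmonic potential and trial state are chosen for illustration) verifying d⟨x⟩/dt = ⟨[x, H]⟩/iℏ = ⟨p⟩/m:

```python
import numpy as np

hbar, m = 1.0, 1.0
N, L = 800, 20.0
x, dx = np.linspace(-L/2, L/2, N, retstep=True)

X = np.diag(x)
# central-difference version of p = -i hbar d/dx on the grid
P = (np.diag(np.ones(N-1), 1) - np.diag(np.ones(N-1), -1)) * (-1j * hbar / (2 * dx))
H = P @ P / (2 * m) + np.diag(0.5 * x**2)      # example: harmonic potential

psi = np.exp(-(x - 1)**2 + 2j * x)             # smooth trial state
psi /= np.sqrt((np.abs(psi)**2).sum() * dx)

lhs = (psi.conj() @ (X @ H - H @ X) @ psi) * dx / (1j * hbar)   # <[x,H]>/(i hbar)
rhs = (psi.conj() @ P @ psi) * dx / m                           # <p>/m
print(lhs.real, rhs.real)     # the two agree: the Heisenberg equation for x
```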
jump at a point. These potentials are only to be looked at as limiting cases of real potentials, and dψ/dx is never discontinuous for physically meaningful potentials. The discontinuity of dψ/dx at such points can be easily understood if one writes the TISE as,

d²ψ/dx² = (2m[V(x) − E]/ℏ²) ψ(x).

We can see that when V(x) jumps to ∞ at a point, for finite E, d²ψ/dx² is infinite at such points, which indicates a discontinuity in dψ/dx.
Going with the general strategy stated above, ψ(x) must be zero for x < 0 and x > L. For 0 < x < L, we have

d²ψ/dx² = −(2mE/ℏ²) ψ(x) = −k² ψ(x)

with k² = 2mE/ℏ² as a positive quantity. One cannot find any meaningful solutions for E < 0. The solution of this is ψ(x) = A sin kx + B cos kx. For ψ(x) to be continuous, we should have:
(i) ψ(0) = 0, which leads to B = 0,
(ii) ψ(L) = 0, implying sin kL = 0 as A ≠ 0 for a non-trivial solution. This implies kL = nπ.
Thus we get ψ(x) = A sin(nπx/L). To normalize ψ(x), we have |A|² ∫₀^L sin²(nπx/L) dx = 1, which leads to |A| = √(2/L). As stated earlier the phase of A cannot be determined and we take it as zero as a convention. We also see that the values of k are quantized in integer multiples of π/L, leading to permitted energy values En = n²π²ℏ²/2mL². To summarize:

ψn(x) = √(2/L) sin(nπx/L) and En = n²π²ℏ²/2mL². (51)
Here are a few remarks on this problem:
1) Energies are discrete, which is different from the classical expectation of continuous energies.
2) The minimum energy, i.e. π²ℏ²/2mL², is non-zero, which is different from the classical expectation of zero.
3) En ∝ n², so (En+1 − En) ∝ n, i.e. the energy separation increases with n.
4) The ψn(x) are orthonormal, i.e. ∫ψn*(x)ψm(x)dx = δnm, with δnm the Kronecker delta.
5) The plot of ψn(x) has n − 1 nodes (other than the two extreme ones). This is a general feature of bound states, i.e. as E increases the number of nodes increases.
6) The uncertainties for ψn(x) work out as ∆pn = nπℏ/L and ∆xn = L√(1/12 − 1/2n²π²). Thus, ∆pn·∆xn = nπℏ√(1/12 − 1/2n²π²). So the uncertainty product increases with n. For n = 1, ∆pn·∆xn = 0.57ℏ > ℏ/2.
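Remark 6 can be reproduced numerically from the ψn directly (units ℏ = L = 1, an assumption):

```python
import numpy as np

hbar, L = 1.0, 1.0          # natural units (assumption)
x, dx = np.linspace(0, L, 100_001, retstep=True)

for n in (1, 2, 5):
    psi = np.sqrt(2 / L) * np.sin(n * np.pi * x / L)
    mean_x  = (x * psi**2).sum() * dx
    mean_x2 = (x**2 * psi**2).sum() * dx
    dx_n = np.sqrt(mean_x2 - mean_x**2)
    dp_n = n * np.pi * hbar / L
    formula = n * np.pi * hbar * np.sqrt(1/12 - 1/(2 * n**2 * np.pi**2))
    print(n, dx_n * dp_n, formula)    # numeric product matches the formula
```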
7) Since V(x) is symmetric about x = L/2, the ψn are either symmetric or antisymmetric about x = L/2. This happens as H commutes with the 'inversion about L/2' operator and there is no degeneracy.
8) One can write any general wave-function that vanishes at x = 0 and x = L as a linear superposition of the ψn(x). This can be seen easily here using the idea of Fourier series. Let's discuss an example wave-function: ψ(x) = A sin³(πx/L). Here, A is the normalization factor. Using the identity sin 3θ = 3 sin θ − 4 sin³θ, we can easily see that the normalized wave-function is given by ψ(x) = (1/√10)[3ψ1(x) − ψ3(x)]. Thus for this case the probabilities to find the particle in the E1 and E3 states will be 9/10 and 1/10, respectively. This also leads to an energy expectation value of (9E1 + E3)/10 = 9π²ℏ²/10mL².
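The expansion of this example state can be recovered numerically from the overlap integrals c_n = ∫ψn ψ dx (units L = 1, an assumption):

```python
import numpy as np

L = 1.0
x, dx = np.linspace(0, L, 200_001, retstep=True)

def eigen(n):
    return np.sqrt(2 / L) * np.sin(n * np.pi * x / L)

psi = np.sin(np.pi * x / L)**3
psi /= np.sqrt((psi**2).sum() * dx)          # normalize

# overlap coefficients c_n = <psi_n | psi>
c = np.array([(eigen(n) * psi).sum() * dx for n in range(1, 6)])
print(c**2)     # probabilities: 9/10 in n = 1, 1/10 in n = 3, zero otherwise
```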
Simulating a classical particle: Clearly, the above stationary eigen-states are far from classical behavior. A natural question then is: how do we construct a classical-like solution using these eigen-state wave-functions? A classical particle in such a box-potential is expected to bounce elastically between the two hard walls of the potential well with constant speed in-between. We construct a superposition wave-function which is localized at a given time and whose time evolution is consistent with the classical expectation. Here is such a wave-function:

ψ±(x, t) = Σ_{n=N−(∆N/2)}^{N+(∆N/2)} sin(nπx/L) exp(±inπa/L) exp(−i n^α ℏπ² t/2mL²).

Here α = 2. This actually gives a wave-packet centered at x = a and moving to the left or right, depending on the sign chosen in the middle exponential term. We choose the unit of time as τ = 4mL²/h. This makes the exponent of the last exp term −iπn^α t/τ. For L = 1 cm and m = 1 gm we get τ ∼ 6 × 10²⁶ s. The center of the wave-function moves at speed vN = Nℏπ/mL. This just means that the magnitude of the speed v0 corresponding to N = 1, i.e. v0 = ℏπ/mL, is extremely small. For the above parameters (1 gm and 1 cm) we get v0 = 3.3 × 10⁻²⁹ m/s. So for a classical particle to move at a noticeable speed, say 10⁻³ m/s, the principal quantum number N will have to be ∼ 3 × 10²⁵, which is huge.
The parameter ∆N is the spread in N corresponding closely to uncertainty in momen-
tum. This is required to make a localized wave-packet representing a classical particle. In
fact the magnitude of the width of the wave-packet in real-space is proportional to 1/∆N .
As time passes this wave-packet’s evolution consists of its motion with reflection at the
walls and dispersion (spread). The time scale of wave-packet motion over the width of
the well is given by L/N v0 while time taken to disperse it in the whole well would be
∼ L/(∆N v0 ). The latter can be very large as compared to the former if ∆N/N is very
small, which would be the case for a classical particle.
All the above features, namely, motion, reflection, dispersion, etc., can be easily
demonstrated with the above wave-packet using Mathematica. One can also see the
effect of various parameters on motion and dispersion. It is worthwhile to notice how the
small wavelength (or sharper) features move faster than the large wavelength components.
It is also interesting to see the effect of the value of α. The wave-packet becomes non-dispersive for α = 1. Further, one can see the change in the nature of dispersion when α is changed from 1.01 to 0.99. One can see how the sharp features lead while the broad tail lags in the former case, and vice versa in the latter case. It is also interesting to see the time evolution from a negative t.
In the Mathematica code one has to run the first half of the code to create the matrix
that stores the complex time dependent wave-packet at different times. This is computa-
tion intensive and for large ∆N it takes longer as the series to be summed has more terms.
After this matrix is created one can run the animation part to get the time evolution of
the wave-function.
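The same experiment can be sketched in Python instead of Mathematica; here with ℏ = 1, m = 1/2, L = 1 (so En = (nπ)² and the group speed is 2Nπ), and illustrative values of N, ∆N and the packet position x0:

```python
import numpy as np

N, dN, x0 = 100, 20, 0.5            # illustrative parameters
n = np.arange(N - dN//2, N + dN//2 + 1)
x, dx = np.linspace(0, 1, 4001, retstep=True)

def density(t):
    # psi(x,t) = sum_n sin(n pi x) e^{-i n pi x0} e^{-i (n pi)^2 t}
    phases = np.exp(-1j * np.pi * n * x0 - 1j * (np.pi * n)**2 * t)
    psi = np.sin(np.pi * np.outer(x, n)) @ phases
    rho = np.abs(psi)**2
    return rho / (rho.sum() * dx)   # normalized probability density

t = 2e-4
center0 = (x * density(0.0)).sum() * dx
center1 = (x * density(t)).sum() * dx
print(center0, center1)   # starts near x0, then drifts by ~ 2*N*pi*t
```

Increasing ∆N narrows the packet, and evaluating `density` over a grid of t values reproduces the motion, reflection and dispersion described above.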
Figure 18: Piecewise constant potential V(x) with a bound state of eigen-energy E.
The bound states will exist for energy values E between the minimum of the potential and its value at infinity. For some x-ranges V(x) < E, while for others V(x) > E. These two types of regions have different types of solutions.
Suppose V (x) = V1 < E for x1 < x < x2; then the TISE reads

ψ″(x) = −(2m/ℏ²)(E − V1)ψ(x) = −k1²ψ(x), with constant k1² = (2m/ℏ²)(E − V1) > 0.

Thus we can write the solution for this x-range in three equivalent ways with two unknowns,

ψ(x) = A1 exp(ik1x) + B1 exp(−ik1x), OR
ψ(x) = C1 sin k1x + D1 cos k1x, OR
ψ(x) = F1 sin(k1x + δ1).
For a region, say x2 < x < x3, where V (x) = V2 > E, the TISE reads

ψ″(x) = (2m/ℏ²)(V2 − E)ψ(x) = κ2²ψ(x), with constant κ2² = (2m/ℏ²)(V2 − E) > 0.

In such regions we can write the solution in the following equivalent ways with two unknowns,

ψ(x) = A2 exp(κ2x) + B2 exp(−κ2x), OR
ψ(x) = C2 sinh κ2x + D2 cosh κ2x, OR
ψ(x) = F2 sinh(κ2x + δ2) (when |C2| > |D2|; otherwise F2 cosh(κ2x + δ2)).
Here A1 , A2 , B1 , B2 ,... etc. are complex constants and δ1 , δ2 are real constants. These are
to be found from the boundary conditions and normalization conditions. In general one
of these constants cannot be found from boundary conditions while others can be found
in terms of this one unknown, which has to be found from the normalization condition.
One can also choose to work with either k or κ for all regions and in this case these can
admit imaginary values also. With these one has to satisfy the boundary conditions at
the boundaries between various constant potential regions. For the above case, one such
boundary is x = x2 . Each boundary leads to two equations from the continuity of ψ(x)
and ψ ′ (x).
There are three remarks to add:
1) For bound states, the wave-function far to the left, i.e. x → −∞, must either be zero, if the potential is infinite in this region, or, if it is finite, have the form exp(κLx) with appropriate κL. Similarly far to the right, i.e. x → +∞, the solution can be either zero or of the form exp(−κRx). Also, if V (x) becomes infinite in certain regions then ψ(x) must vanish in such regions.
2) zero of energy: We have the freedom to choose the zero of energy and potential,
as long as we choose the same for both. Different problems may make a particular choice
of this zero and one choice may be more convenient than others. This change in choice of
zero offsets both V (x) and E equally keeping their difference the same. Thus the TISE
and its solutions remain unchanged.
3) In case the potential is symmetric about some point, say x = 0, then the solutions
of TISE can always be chosen to be either symmetric [i.e. ψ(−x) = ψ(x)] or anti-
symmetric [i.e. ψ(−x) = −ψ(x)]. In 1D the bound states are always non-degenerate and
then the associated wave-functions are always either symmetric or antisymmetric. The
scattering states are usually degenerate and thus one can choose non-symmetric solutions
if convenience dictates. This aspect will become more clear when we discuss the linear
vector spaces and symmetry operators.
Figure 19: Symmetric finite well potential V (x) of width a and depth V0 .
This is clearly a piecewise constant potential and following the general strategy dis-
cussed above, we write:
For |x| < a/2, ψ″(x) = −(2mE/ℏ²)ψ(x) = −k²ψ(x), and
for |x| ≥ a/2, ψ″(x) = (2m(V0 − E)/ℏ²)ψ(x) = α²ψ(x).

Here,

k = √(2mE/ℏ²) and α = √(2m(V0 − E)/ℏ²). (52)
Thus, for the three regions I, II and III marked in the figure, the general solutions are of the form Ce^{αx}, A cos kx + B sin kx, and De^{−αx}, respectively. Note that for bound states ψ → 0 as x → ±∞. Also, since the potential here is symmetric about x = 0, we can choose the wave-functions to be either symmetric or anti-symmetric about x = 0. Below we discuss these two types separately.
Symmetric Solutions: The even parity (or symmetric) solution is ψ(x) = A1 cos kx
for |x| < a/2 and ψ(x) = C1 e−α|x| for |x| ≥ a/2. The continuity of ψ and ψ ′ at x = a/2
lead to,
A1 cos(ka/2) = C1 e^{−αa/2},
−kA1 sin(ka/2) = −αC1 e^{−αa/2}. (53)

The boundary conditions at the other boundary, i.e. x = −a/2, lead to identical equations. Dividing these we get the relation k tan(ka/2) = α. Defining η = ka/2 and ξ = αa/2 we get

ξ = η tan η. (54)
Using Eq. 52, we also get

η² + ξ² = mV0a²/2ℏ². (55)
These last two equations give solutions for η and ξ or in other words for k and α. These
lead to allowed E values. These equations are transcendental and not analytically solvable.
One can solve them numerically, or one can use the graphical construction, which is quite insightful. After knowing an allowed E value one can use one of the relations in Eq. 53 to obtain C1 in terms of A1, while A1 can be found from the normalization condition. Before discussing the graphical solutions, let's first discuss the anti-symmetric solutions.
Antisymmetric Solutions: The odd parity (or antisymmetric) solution consists of
ψ(x) = C2 eαx for x < −a/2, ψ(x) = A2 sin kx for |x| < a/2, ψ(x) = −C2 e−α|x| for
x > a/2. The continuity of ψ and ψ ′ at x = a/2 lead to,
A2 sin(ka/2) = C2 e^{−αa/2},
kA2 cos(ka/2) = −αC2 e^{−αa/2}. (56)

Dividing these we get the relation −k cot(ka/2) = α, and with η = ka/2 and ξ = αa/2 we get

ξ = −η cot η. (57)
As discussed for the symmetric case, this and Eq. 55 have to be solved simultaneously to
obtain allowed E values and the unknowns A2 and C2 can be found in a similar way.
Figure 20: Graphical solutions for η for a symmetric finite well potential. The black lines diverging at π/2, 3π/2, etc. correspond to ξ = η tan η and the red lines diverging at π, 2π correspond to ξ = −η cot η. The black circular arcs correspond to ξ² + η² = mV0a²/2ℏ² for different V0 values.
Graphical Solutions for E: Figure 20 shows the plots of Eqs. 54, 57 and 55. The
last equation is plotted for different values of V0 . We can find the solutions for η and thus
E for a given V0 from the intersection points of the circles with the two transcendental
equations. We choose to plot these equations in first quadrant of η − ξ plane. This is
justified as we defined k and α as positive square-roots.
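The graphical construction can also be carried out numerically. Below is a Python sketch (the value R2 = mV0a²/2ℏ² is an illustrative choice) that finds the intersections of Eqs. 54 and 57 with the circle of Eq. 55 by bisection:

```python
import math

# Numerical counterpart of the graphical construction; R2 is illustrative.
R2 = 20.0
R = math.sqrt(R2)

def circle(eta):
    # xi on the circle of Eq. 55
    return math.sqrt(max(R2 - eta * eta, 0.0))

def bisect(f, a, b):
    # simple bisection, assuming f(a) and f(b) have opposite signs
    fa = f(a)
    for _ in range(200):
        m = 0.5 * (a + b)
        if fa * f(m) <= 0.0:
            b = m
        else:
            a, fa = m, f(m)
    return 0.5 * (a + b)

even, odd = [], []
eps, n = 1e-9, 0
while n * math.pi < R:
    f = lambda e: e * math.tan(e) - circle(e)       # even: xi = eta tan eta
    a, b = n * math.pi + eps, min(n * math.pi + math.pi / 2 - eps, R - eps)
    if a < b and f(a) * f(b) < 0:
        even.append(bisect(f, a, b))
    g = lambda e: -e / math.tan(e) - circle(e)      # odd: xi = -eta cot eta
    a, b = n * math.pi + math.pi / 2 + eps, min((n + 1) * math.pi - eps, R - eps)
    if a < b and g(a) * g(b) < 0:
        odd.append(bisect(g, a, b))
    n += 1

print(even, odd)   # interleaved roots: even, odd, even, ... (3 states here)
```

Each root η gives k = 2η/a and hence an allowed bound-state energy through Eq. 52.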
Finally, a few remarks on this problem:
1) ξ = η tan η is quadratic near first zero, i.e. η = 0, but it is linear with increasing
slope near other zeroes. In fact, the slope at the η = nπ zero is nπ.
2) We see that for (mV0 a2 /2ℏ2 ) < (π/2)2 there is no odd parity solution possible.
However for even parity there is at least one possible solution for arbitrary V0 . So there
is at least one bound state for any 1D symmetric potential well.
3) For V0 → ∞ the circle ξ² + η² = mV0a²/2ℏ² intersects the ξ = η tan η lines at η = π/2, 3π/2, 5π/2, ... and the ξ = −η cot η lines at η = π, 2π, 3π, .... Combining the two we get, for the allowed k values, ka = 2η = nπ, which leads to the same energies as those of the infinite well potential, as expected.
4) The wave-functions for the lowest-energy symmetric and antisymmetric bound states are shown in Fig. 21. Clearly the ground state is described by the symmetric wave-function. These can be easily obtained by finding the Ai/Ci ratio (Eqs. 56 and 53) after we know the admissible η and ξ values from the eigen-energies. The other unknown is dictated by the normalization condition.
Figure 21: Wave-functions of the lowest-energy symmetric and antisymmetric bound states of the finite well.
5) Another interesting limit of this problem is when the potential approaches a negative δ-function potential. This can be realized by taking the limit V0 → ∞ and a → 0 such that V0a = λ, a constant. We also need to shift the zero of energy to V0, so that the potential is −V0 for |x| < a/2 and zero otherwise, see Fig. 22. This way ∫_{−ϵ}^{ϵ} V(x)dx = −λ. With this there will be exactly one bound state, as the radius of the circle ξ² + η² = mV0a²/2ℏ² = mλa/2ℏ² approaches zero in the δ-function limit. This will be a symmetric bound state, so we need to look at the intersection point of ξ = η tan η with this circle of diminishing radius-squared mλa/2ℏ². Also, with the new choice for the zero of energy, the bound state will have negative energy, say −E0, such that ξ = αa/2 = √(2mE0a²)/2ℏ and η = ka/2 = √(2m(V0 − E0)a²)/2ℏ.
For small η, η tan η ≈ η², so we need to find the intersection of ξ = η² with the diminishing circle ξ² + η² = mλa/2ℏ². This gives ξ² + ξ = mλa/2ℏ², and in the small-ξ limit we omit the quadratic term to get ξ = mλa/2ℏ², or mE0a²/2ℏ² = m²λ²a²/4ℏ⁴, leading to E0 = mλ²/2ℏ². Thus the bound state energy is −mλ²/2ℏ². Furthermore, the bound state
wave-function works out as ψ(x) = Ae^{−α|x|} with α = mλ/ℏ². We also see that the derivative of this wave-function is discontinuous at x = 0. This discontinuity is given by ∆ψ′|x=0 = −A(2mλ/ℏ²). One can also solve the δ-function bound state problem directly by finding this discontinuity starting from the TISE. This is to be done as a HW.
Figure 22: Negative δ-function limit of the finite well potential. The middle plot illustrates the intersection of ξ = η² and the circle ξ² + η² = mλa/2ℏ² of diminishing radius. The rightmost plot is the wave-function of the only permitted bound state.
physically permitted wave-function made by superposition of such eigen-state solutions
must be normalizable and thus it should vanish at ∞.
Figure 23: Step potential with a particle wave striking with E < V0 .
Let's consider the step potential shown in Fig. 23, given by

V(x) = 0 for x < 0,  V(x) = V0 for x > 0.
We consider two relevant cases separately, E < V0 and E > V0 . The V0 < 0 scenario is
also effectively covered by these two cases. We look for solution where a particle wave
with wave-vector +k and energy E = ℏ2 k 2 /2m is incident from left. This wave can get
partially reflected and partially transmitted, in general.
Case I: 0 < E < V0. The acceptable solution of the TISE for this case is given by

ψ(x) = Ae^{ikx} + Be^{−ikx} for x < 0,  ψ(x) = De^{−αx} for x > 0.
Here Ae^{ikx} represents the incident wave and Be^{−ikx} the reflected one. On the right of the step is the classically forbidden region, which extends to x → ∞, thus e^{αx} is ruled out. Note also k² = 2mE/ℏ² and α² = 2m(V0 − E)/ℏ². The continuity of ψ and ψ′ at x = 0 leads to two equations:
A + B = D,
ik(A − B) = −αD, or A − B = (iα/k)D.

By adding and subtracting these two equations we get

A = (D/2)(1 + iα/k) and B = (D/2)(1 − iα/k).

Thus, in terms of A, we get

D = A 2k/(k + iα) and B = A (k − iα)/(k + iα).
This leads to the final wave-function,

ψk(x) = Ae^{ikx} + A (k − iα)/(k + iα) e^{−ikx} for x < 0,
ψk(x) = A 2k/(k + iα) e^{−αx} for x > 0.
Here, E, k and α take continuous values. One can superpose many such different-k wave-functions to create localized particle-like states. The time evolution of such a wave-function is easily captured by incorporating the phase factors exp(−itℏk²/2m) in the superposition sum. This was illustrated in the lecture by looking at the time evolution of the wave-function ψ(x, t) = Σ_k A_k ψ_k(x) e^{−itℏk²/2m} with |A_k| = exp[−(k − k0)²/δk²].
Figure 24: Step potential with a particle wave striking with E > V0 .
Case II: E > V0. Now the solution is

ψ(x) = A1e^{ikx} + B1e^{−ikx} for x < 0,  ψ(x) = D1e^{ik′x} for x > 0.
Here A1e^{ikx} represents the incident wave and B1e^{−ikx} the reflected one; on the right of the step is the transmitted wave.
We can again write the boundary conditions at x = 0 and get linear equations to find
B1 and D1 in terms of A1 as we did earlier. This is easily achieved after we recognize
that all the steps of Case-I are repeated after replacing α by −ik ′ . With this we get the
following wave-function for this case,
ψk(x) = A1e^{ikx} + A1 (k − k′)/(k + k′) e^{−ikx} for x < 0,
ψk(x) = A1 2k/(k + k′) e^{ik′x} for x > 0.
Thus we get S_I = (ℏk/m)|A1|², S_R = −(ℏk/m) (k − k′)²/(k + k′)² |A1|² and S_T = (ℏk′/m) 4k²/(k + k′)² |A1|². You can easily verify that |S_T| + |S_R| = |S_I|. Finally, we get

T = S_T/S_I = 4k′k/(k + k′)²  and  R = |S_R|/|S_I| = (k − k′)²/(k + k′)² = 1 − T.
For V0 < 0 we can easily extrapolate Case-II, with the plane wave incident from the right. The variation of reflectance and transmittance with energy is shown in Fig. 25.
Figure 26: The quantum tunneling across a barrier of height V0 and width a.
ψ(x) = Ae^{ikx} + Be^{−ikx} for x < 0,
ψ(x) = Ce^{αx} + De^{−αx} for 0 < x < a,
ψ(x) = Fe^{ikx} for x > a.
Here k² = 2mE/ℏ² and α² = 2m(V0 − E)/ℏ². The boundary conditions dictate that ψ and ψ′ be continuous at x = 0 and x = a, and this leads to the following four linear equations:
A + B = C + D,
A − B = (α/ik)(C − D),
Ce^{αa} + De^{−αa} = Fe^{ika},
Ce^{αa} − De^{−αa} = (ik/α)Fe^{ika}.
One has to solve these to find B, C, D and F in terms of A in order to find the complete
wave function. For finding the transmittance which corresponds to the probability of
tunneling for a given particle we need to mainly find F . One can solve these by various
linear manipulations. For instance, addition and subtraction of the last two equations
leads to C and D in terms of F . Addition of first two eliminates B and then one can use
C and D found earlier to find a relation between F and A. This works out as,
F = A 2kα e^{−ika} / [2kα cosh αa + i(α² − k²) sinh αa].
The others, i.e. C, D and B, can be worked out in terms of A using

C = (F/2) e^{−αa}(1 + ik/α)e^{ika},  D = (F/2) e^{αa}(1 − ik/α)e^{ika}  and  B = C + D − A.
The relation between F and A can be used to find the transmittance T = S_T/S_I = |F|²/|A|² as

T = [cosh² αa + (1/4)(α/k − k/α)² sinh² αa]^{−1}.
In the above, E > V0 is also an interesting case of transmission over a barrier. This can easily be worked out by replacing α by ik′ with k′² = 2m(E − V0)/ℏ². This leads to the expression for transmission,

T = [1 + ((k² − k′²)²/(4k²k′²)) sin² k′a]^{−1}.
We see that T = 1 for k ′ a = nπ, i.e. full transmission as shown in Fig. 27. This is a
resonance condition arising from the constructive interference between waves scattered
at x = 0 and x = a. This is similar to the physics of anti-reflection coatings where one
chooses a thickness of the coating and its refractive index such that k ′ a = nπ leading to
zero reflection. This kind of resonant transmission is seen in scattering of electrons by
atoms and nuclei.
In realistic situations the barrier is never rectangular, and solving a general tunneling problem with arbitrary V (x) is not possible analytically. In certain limits one can make a useful approximation called the WKB approximation, named after three physicists: Wentzel, Kramers and Brillouin. According to this approximation the tunneling probability across a potential barrier described by V (x) is given by,
" Z r #
x2
2m
P ≈ exp −2 (V (x) − E)dx . (59)
x1 ℏ2
Figure 28: Tunneling across an arbitrarily shaped barrier with classical turning points x1,2 dictated by energy E.

Here, x1 and x2 are the classical turning points, and the region between x1 and x2, shown in Fig. 28, is classically forbidden. Naively, one can justify the above expression using Eq. 58 by dividing the barrier into slices of width dx between the two turning points. The overall tunneling probability is then the product of the probabilities of tunneling across each of these width-dx slices, where the potential is V (x).
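This slice picture is easy to test numerically: the WKB exponent is just a Riemann sum of thin-slice contributions. The sketch below uses an illustrative parabolic barrier V(x) = V0(1 − x²/b²), in units ℏ = 2m = 1, for which the exponent has the closed form πb(V0 − E)/(2√V0):

```python
import math

# Midpoint-rule sum of the WKB exponent for V(x) = V0*(1 - x^2/b^2).
V0, b, E = 5.0, 1.0, 2.0
xt = b * math.sqrt(1.0 - E / V0)       # classical turning points at +/- xt

N = 200000
dx = 2.0 * xt / N
expo = 0.0
for i in range(N):
    x = -xt + (i + 0.5) * dx           # midpoint of slice i (strictly inside)
    expo += math.sqrt(V0 * (1.0 - (x / b) ** 2) - E) * dx

P = math.exp(-2.0 * expo)
print(expo, P)   # expo matches pi*b*(V0 - E)/(2*sqrt(V0))
```

The midpoint rule is convenient here because the integrand vanishes at the turning points, so no slice ever samples the classically allowed region.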
Figure 29: The left figure shows the schematics of the barrier and the electron energies
between two metal surfaces at distance a and with a potential difference V . The latter
lowers the Fermi energy of one metal relative to the other by energy eV . The right figure
shows the schematics of a STM with a sharp metal tip and a conducting sample surface.
In more detail, one actually uses a very sharp metal (gold or platinum) tip and a flat conducting sample surface for STM, as depicted in Fig. 29. A potential difference of 100 mV or less is applied between tip and sample to slightly misalign the Fermi energies of the two metals so that there is a biased flow due to tunneling between the two. At zero potential difference equal numbers of electrons tunnel in each direction at finite temperature, resulting in zero net current. Also, since electrons obey the Pauli exclusion principle, they can only tunnel from the filled states of one electrode to the empty states of the other electrode. At a fixed bias the tunnel current is directly proportional to the tunneling probability, which depends exponentially on the tip-sample separation. Thus when the tip is scanned over a surface having height variations, the tunnel current reflects the local surface height (z), and one can thereby obtain a topographic image of the surface. The exponential dependence of the tunnel current on ‘a’ also helps in achieving very good lateral (xy) resolution: in a sharp metal tip one atom sticks out the most, and other atoms that are behind this one by even 0.5 Å contribute a much smaller tunnel current.
Figure 30: Potential barrier formed by strong forces and Coulomb repulsive forces between
two nuclei.
The tunneling theory of α-decay was given by George Gamow. In this theory it was assumed that protons and neutrons are bound to each other through the strong force, which is effective only at very small distances of about a fermi (femtometre, or 10^−15 m). Beyond this separation the interaction between two nuclei is dominated by the repulsive Coulomb potential energy

V(r) = Z1Z2e²/(4πϵ0 r),

with Z1,2 the atomic numbers of the two nuclei and r their separation. It is assumed that an α-particle, with energy E equal to that of the emitted α-particle, attempts to tunnel across the barrier formed by the strong and Coulomb forces as shown in Fig. 30. The Coulomb potential here corresponds to V(r) with Z1 = 2 and Z2 = Z − 2. One does not know much about the dependence of the strong interaction on r, but it leads to a large negative potential drop over an extremely short distance ≤ 1 fm. The range of the Coulomb potential is much longer.
Thus we see that the classical turning point corresponding to the strong force is R ∼ 1 fm. The other turning point r1 can be estimated, for Z2 = 84, Z1 = 2 and E = 4 MeV, using

V(r1) = Z1Z2e²/(4πϵ0 r1) = E  or  r1 = Z1Z2e²/(4πϵ0 E). (60)
This gives r1 ≈ 40 fm. It is useful to know that e²/4πϵ0R = 0.96 MeV for R = 1 fm. The barrier height would be

∆V = V(R) = Z1Z2e²/(4πϵ0 R). (61)
This for Z2 = 84, Z1 = 2 works out to be about 160 MeV, which is much higher than E.
Therefore, in order to find the rate at which an α-particle is emitted from a given nucleus we need to find 1) the probability P of tunneling across the barrier depicted in Fig. 30, and 2) the rate at which the α-particle strikes the barrier from inside the parent nucleus. For finding P we use Eq. 59 with the turning points R and r1 to get,
" √ Z r1 p # " √ Z r1 s #
2m 2m Z1 Z2 e2
P = exp −2 (V (r) − E)dr = exp −2 − E dr
ℏ R ℏ R 4πϵ0 r
" √ Z r # " √ #
2mE r1 r1 r1 2mE
= exp −2 − 1 dr = exp −2 I .
ℏ R r ℏ
Here m is the mass of the α-particle, and in the above we have also used Eq. 60. The integral I is given by

I = (1/r1) ∫_R^{r1} √(r1/r − 1) dr = ∫_{R/r1}^{1} √(1/x − 1) dx
  = ∫_0^1 √(1/x − 1) dx − ∫_0^{R/r1} √(1/x − 1) dx.
The first term above can be evaluated using x = sin²θ; it works out to π/2. The second term can be found with the approximation R/r1 ≪ 1, so that 1/x − 1 ≈ 1/x over the range of integration. This eventually leads to

I = π/2 − 2√(R/r1).
Thus we finally get,

P = exp[−2 (r1√(2mE)/ℏ)(π/2 − 2√(R/r1))].
This can be rewritten as P = C1 e^{−B/√E}. Here B = √(2m) Z1Z2e²/(4ϵ0ℏ) and ln C1 = 4√(2m Z1Z2e²R/4πϵ0)/ℏ = 4R√(2mV(R)/ℏ²).
The strike rate, or attempt rate, of the α-particle inside the nucleus can be estimated as v/2R, where v is its speed, leading to the net emission rate (v/2R)P. The lifetime, which is the inverse of the emission rate, is given by τ = (2R/v)P^{−1}. We expect v = √(2E/m). Thus we can write,

τ = (2R/vC1) e^{B/√E},
or  log τ = log(2R√m/√(2E)) − log C1 + (B log e)/√E.
In the above, the E dependence of the last term is by far the most drastic compared to that of the first term; basically v does not change significantly over the energy range of interest, from 4 to 10 MeV. Moreover the Z dependence of B, C1 or R is much weaker. Thus we write,
log τ = C2 + (B log e)/√E. (62)
Thus we use a typical Z = 86 to obtain B log e ≈ 150 √MeV and C2 ≈ −53, for τ in seconds.
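With these constants, Eq. 62 immediately shows why α-decay lifetimes span so many orders of magnitude over a modest energy range; a quick sketch (energies illustrative):

```python
import math

# Eq. 62 with the constants quoted above: B log e ~ 150 sqrt(MeV), C2 ~ -53,
# giving log10 of the lifetime in seconds.
B_log_e, C2 = 150.0, -53.0

def log10_tau(E_MeV):
    return C2 + B_log_e / math.sqrt(E_MeV)

for E in (4.0, 6.0, 9.0):
    print(E, log10_tau(E))   # 4 MeV -> 22, 9 MeV -> -3: ~25 orders of magnitude
```

A factor of about two in the α-particle energy changes the lifetime from billions of years to milliseconds, which is exactly the trend seen in Fig. 31.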
The success of this theory becomes clear when one looks at the experimentally found
τ and E for a variety of nuclei and compares it with the theoretical plot as per Eq. 62.
This is shown in Fig. 31.
7 Nuclear fusion
Nuclear fusion involves the same physics, except this time two positively charged nuclei have to come within the range of the strong force to bind with each other. The nuclei again have to overcome the Coulomb barrier given by Eq. 61. For instance, for the fusion reaction between two protons, i.e.

p + p → D + e⁺ + νe,

the energy barrier (Z1 = Z2 = 1) works out to be about 1 MeV. Such fusion reactions actually occur in the Sun, whose core temperature is 10⁶ K and surface temperature 6000 K. The corresponding thermal energies (i.e. kBT) are about 100 eV and 0.5 eV.
Thus the probability of overcoming the barrier by thermal activation, i.e. e^{−∆V/kBT}, in the core of the Sun would be about e^{−10000}. If instead the two nuclei approach each other with kinetic energy (in the center-of-mass frame) E = kBT, the tunneling probability, according to Eq. 62 (with B ≈ 1 √MeV), will be about e^{−100}. We thus see that these nuclear reactions in the Sun actually occur due to quantum tunneling.
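The two exponents quoted above follow directly from the numbers in the text; a one-line comparison:

```python
import math

# Thermal activation vs tunnelling for p-p fusion, numbers as quoted above.
kT_MeV = 100e-6          # kB*T ~ 100 eV in the Sun's core, in MeV
dV_MeV = 1.0             # p-p Coulomb barrier ~ 1 MeV
B = 1.0                  # ~1 sqrt(MeV) for Z1 = Z2 = 1

thermal_exp = dV_MeV / kT_MeV        # activation: probability ~ exp(-10000)
tunnel_exp = B / math.sqrt(kT_MeV)   # tunnelling: probability ~ exp(-100)
print(thermal_exp, tunnel_exp)
```

The tunnelling exponent is a hundred times smaller, i.e. the tunnelling probability, while tiny, is astronomically larger than the thermal-activation one.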
Figure 31: Experimentally measured half-lives of different nuclei as a function of the emitted α-particle energy E. The dotted line shows the expectation as per the tunneling theory, Eq. 62, with B log e = 148 √MeV and C2 = −53.5. Note the log scale on the vertical axis and the −1/√E scale on the horizontal axis. Also note that the half-life and the decay time τ differ by a factor ln 2.
is an abstract entity and we choose a set of axes, or basis vectors, in order to work with vectors mathematically. The exact mathematical form of a given vector depends on the bases (or axes) that we choose in 3D space. These same concepts will be generalized for quantum mechanics, as LVS plays a very important role in QM. To define an LVS formally:
it is a set of elements, say {V⃗1, V⃗2, V⃗3, V⃗4, . . . }, satisfying the following:

1. If V⃗1 and V⃗2 are two vectors, i.e. elements of this set, then λ1V⃗1 + λ2V⃗2 is also a vector. Here λ1, λ2 are real numbers, and so this LVS is called a real LVS. We similarly have complex LVS also.

2. There is a scalar product defined, i.e. V⃗1 · V⃗2, which is a scalar, i.e. a real number in the case of a real LVS. We know what this scalar product means for the LVS of vectors in 3D, i.e. it is the scalar or dot product of two vectors. This scalar product satisfies a few properties, namely,

(a) The scalar product is linear with respect to both V⃗1 and V⃗2, i.e.

(λ1a V⃗1a + λ1b V⃗1b) · V⃗2 = λ1a V⃗1a · V⃗2 + λ1b V⃗1b · V⃗2. (63)

Similarly it is linear wrt V⃗2.

(b) If V⃗1 · V⃗2 = 0 then either V⃗1 or V⃗2 is zero, i.e. a null vector, or V⃗1 and V⃗2 are orthogonal, i.e. mutually perpendicular in the case of vectors in 3D space.

(c) V⃗ · V⃗ = 0 if and only if V⃗ is a null vector. The norm of the vector, i.e. ||V⃗|| (or simply |V⃗|), gets defined as ||V⃗||² = V⃗ · V⃗. The norm represents the magnitude of the vector and is a non-negative real number.

(d) Schwarz inequality: The scalar product satisfies

(V⃗1 · V⃗2)² ≤ ||V⃗1||² ||V⃗2||². (64)

Here the equality holds if V⃗1 ∝ V⃗2, i.e. if V⃗1 and V⃗2 are parallel or co-linear.
With the scalar product defined one can also define the angle between two vectors, and this generalizes to other LVSs as well. Such vector spaces are also called Hilbert spaces, a common terminology for any quantum system, as the Hilbert space encompasses all possible states wrt a particular degree of freedom, such as the spin degree of freedom or the spatial (3D) degree of freedom.
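The axioms above are concrete enough to check numerically; a quick pure-Python sketch for the real 3D LVS, testing the Schwarz inequality (Eq. 64) and the equality case for co-linear vectors:

```python
import random

# Numerical check of the Schwarz inequality for random 3D real vectors.
random.seed(0)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

ok = True
for _ in range(1000):
    V1 = [random.uniform(-1, 1) for _ in range(3)]
    V2 = [random.uniform(-1, 1) for _ in range(3)]
    ok = ok and dot(V1, V2) ** 2 <= dot(V1, V1) * dot(V2, V2) + 1e-12
    V3 = [2.5 * a for a in V1]        # co-linear with V1: equality should hold
    ok = ok and abs(dot(V1, V3) ** 2 - dot(V1, V1) * dot(V3, V3)) < 1e-9
print(ok)
```

The small tolerances only guard against floating-point round-off.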
with δij as the Kronecker delta, i.e. δij = 1 if i = j and zero otherwise. Now any vector V⃗ can be written as V⃗ = V1û1 + V2û2 + V3û3 = Σi Vi ûi. Representability of any vector (or element of the LVS) in terms of the ûi is very important for an acceptable basis-set. It is also called completeness of the bases, or the closure condition. Here Vi is a number and it represents the ith component of V⃗. These {Vi}, i.e. their values, depend on which bases we choose. In 3D we are free to choose any set of three orthogonal axes and corresponding unit vectors as bases. The number of independent basis vectors required for completeness is called the dimensionality of the LVS, which is 3 for the LVS of vectors in 3D.
After choosing this representation (or bases), a particular component of V⃗ is the scalar product with the corresponding basis vector, i.e. Vi = V⃗ · ûi. In terms of these components, we can write the scalar product between two vectors as

U⃗ · V⃗ = (Σi Ui ûi) · (Σj Vj ûj) = Σij Ui Vj δij = Σi Ui Vi. (66)
Further, if we look at U⃗ and V⃗ as two column matrices, then the scalar product U⃗ · V⃗ is equivalent to the matrix product (U⃗)ᵀV⃗, where the first factor is the transpose of the column matrix U⃗, giving a row matrix. This will also help us appreciate the Dirac notation discussed later. This can be seen more explicitly as,

U⃗ · V⃗ = Σi Ui Vi = (U1 U2 U3)(V1, V2, V3)ᵀ = (U⃗)ᵀV⃗.
While we are all familiar with the above for the LVS of vectors in 3D space, it is useful to keep it in mind as a specific example of an LVS. It gives some intuition and is good for remembering certain relations for other LVSs that are more complicated and indispensable in QM.
2. Operator products satisfy the associative law, i.e. (AB)C = A(BC), but not the commutative law (in general), i.e. AB ̸= BA. There may be some special operators that commute with each other, and this is of special significance in QM.
To write an operator A in concrete mathematical form using certain bases {ûi}, we analyze its operation on a given vector V⃗ in the same bases. Suppose V⃗ = Σi Vi ûi, V⃗′ = Σi Vi′ ûi and AV⃗ = V⃗′. Taking the scalar product of the last relation with ûj, we get ûj · (AV⃗) = ûj · V⃗′ = Vj′. This gives Vj′ = ûj · (A Σi Vi ûi), or Vj′ = Σi (ûj · Aûi) Vi. Thus we see that we can find the result of A operating on any given vector if we know the entities {ûj · Aûi} corresponding to a given operator A. These entities are called the matrix elements of A, i.e. Aji = ûj · Aûi, and so we get Vj′ = Σi Aji Vi.
It is more convenient to work with the matrix form of operators and vectors. With the matrix elements defined above, we can write V⃗′ = AV⃗ as

(V1′)   (A11 A12 A13) (V1)
(V2′) = (A21 A22 A23) (V2)   (67)
(V3′)   (A31 A32 A33) (V3)
The above defines the matrix forms of vectors, i.e. elements of the LVS, and of operators, after choosing a bases, i.e. a complete set of basis vectors. One can also define the general matrix element of an operator as V⃗1 · AV⃗2, which can be written in matrix form and eventually simplifies to Σij V1j Aji V2i. We should realize that the "linearity" of operators and vectors plays an important role in all these simplifications.
gives the vector in the new basis. We can see that for a real LVS the transformation matrix S is a real matrix with elements Ski = v̂k · ûi, the scalar products between the elements of the two bases. Given the orthonormality of the two bases, we can also verify that SSᵀ is the unit matrix, i.e. S is unitary (orthogonal, in the real case). Thus one can also write the inverse transformation, i.e. Vk = Σi (Sᵀ)ki Vi′. Here Sᵀ is the transpose of S. Using this S we can also write v̂k = Σi ûi (ûi · v̂k), i.e.

v̂k = Σi Ski ûi. (70)
Similarly, for an operator A, in the {ûi} bases its matrix elements are {Aij} while in the {v̂i} bases they are {A′ij}. We have to find how the two are related to each other. For this we use the above Eq. 70 to write A′kl = v̂k · Av̂l = Σij Ski (ûi · Aûj) Slj = Σij Ski Aij (Sᵀ)jl. Thus in the {v̂i} bases the operator A will be given by

A′ = S A Sᵀ. (71)
We see that the change of bases is captured, for both the vectors and operators, by
the unitary matrix S and thus this is also called a unitary (or similarity, for real LVS)
transformation.
We can also see that a scalar expression, such as V⃗1 · AV⃗2, works out the same no matter which bases are chosen. For this it is convenient to look at the scalar product in matrix form. V⃗1 · AV⃗2 in the second bases will be V⃗1′ · A′V⃗2′, which in matrix notation is (V⃗1′)ᵀ A′ V⃗2′. Using the change-of-bases transformation matrix S, this gives (SV⃗1)ᵀ(SASᵀ)(SV⃗2) = V⃗1ᵀ Sᵀ S A Sᵀ S V⃗2 = V⃗1ᵀ A V⃗2 = V⃗1 · AV⃗2, using the unitary property of S. This illustrates that such scalar expressions are independent of the bases. It is useful to work out some such examples to gain experience.
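One such example is easy to set up numerically. The sketch below builds a random orthogonal S (via a QR factorization, an illustrative choice) and checks both SSᵀ = 1 and the bases-independence of the scalar V⃗1 · AV⃗2:

```python
import numpy as np

# A change of bases as an orthogonal (unitary, in the real case) transformation.
rng = np.random.default_rng(1)
S, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # S has orthonormal rows/columns

A = rng.normal(size=(3, 3))      # an operator in the old bases
V1 = rng.normal(size=3)
V2 = rng.normal(size=3)

Ap = S @ A @ S.T                 # Eq. 71: A' = S A S^T
V1p, V2p = S @ V1, S @ V2        # vectors in the new bases

print(np.allclose(S @ S.T, np.eye(3)))                # S S^T = 1
print(np.isclose(V1 @ (A @ V2), V1p @ (Ap @ V2p)))    # scalar is bases-independent
```

Any orthogonal S would do here; QR of a random matrix is just a convenient way to generate one.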
1. If ψ1(x) and ψ2(x) are two valid wave-functions, i.e. elements of this LVS, then c1ψ1(x) + c2ψ2(x) is also a valid wave-function. Here c1 and c2 are complex numbers.

2. There is a scalar product defined as ⟨ψ1|ψ2⟩ = ∫_{−∞}^{∞} ψ1*(x)ψ2(x) dx. (72) The notation used on the left of this equation for the scalar product is called Dirac notation, and we'll discuss it in more detail later. The scalar product satisfies the following properties:
(a) This scalar product is linear wrt ψ2 and it is anti-linear wrt ψ1 , i.e. ⟨(c1a ψ1a +
c1b ψ1b )|ψ2 ⟩ = c∗1a ⟨ψ1a |ψ2 ⟩+c∗1b ⟨ψ1b |ψ2 ⟩ and ⟨ψ1 |(c2a ψ2a +c2b ψ2b )⟩ = c2a ⟨ψ1 |ψ2a ⟩+
c2b ⟨ψ1 |ψ2b ⟩
(b) If ⟨ψ1 |ψ2 ⟩ = 0 then either ψ1 (x) or ψ2 (x) is zero or ψ1 (x) and ψ2 (x) are or-
thogonal.
(c) The norm of a wave-function ψ(x) gets defined using the scalar product as

||ψ||² = ⟨ψ|ψ⟩ = ∫_{−∞}^{∞} |ψ(x)|² dx. (73)

Thus the norm is real and non-negative with a well-defined square root; in QM ||ψ||² gives the overall probability.
(d) The scalar product satisfies the Schwarz inequality: |⟨ψ1 |ψ2 ⟩|2 ≤ ||ψ1 ||2 ||ψ2 ||2 .
One can easily generalize the notion of linear operators to this LVS as well. An operator is still a linear mapping from one wave-function to another, i.e. ϕ(x) = Aψ(x). The linearity implies A[c1ψ1(x) + c2ψ2(x)] = c1[Aψ1(x)] + c2[Aψ2(x)]. One can re-examine this with the operators already encountered, such as the position operator x, p, H, etc. A few others are the parity operator, defined by Πψ(x) = ψ(−x), and the translation-by-a operator, i.e. Taψ(x) = ψ(x + a). Products of operators are also easily generalized. The products again follow the associative law but not, in general, the commutative law. The latter is important for QM, and one defines the commutator between two operators A and B as [A, B] = AB − BA; e.g. [x, px] = iℏ.
The last quantity is the scalar product in abstract form (without bases), which results in a complex number. This scalar product can be expanded as per Eq. 72 once we choose the x-bases. A useful identity, easily verified using Eq. 72, is ⟨ϕ|ψ⟩* = ⟨ψ|ϕ⟩. In mathematical jargon there are two LVSs, conjugate (or dual) to each other, with an element |ψ⟩ having conjugate ⟨ψ|. When we work with matrix algebra, it turns out that these two elements are actually Hermitian conjugates of each other, i.e. (|ψ⟩)† = ⟨ψ|, with the ket equivalent to a column matrix (or column vector) and the bra equivalent to a row matrix.
The wave-function ψ(x) actually gives a complex number at a specific value of x, which is basically the scalar product of the state |ψ⟩ with the basis-state corresponding to x, i.e. |x⟩. So ψ(x) = ⟨x|ψ⟩. One can easily draw an analogy with the LVS of vectors in 3D, where Vi, the ith component of V⃗, is the scalar product ûi · V⃗. Similarly, in the p-representation gψ(p) = ⟨p|ψ⟩. Obviously, gψ(p) and ψ(x) are two different mathematical functions, but they represent the same abstract quantum state. This is analogous to V⃗ and V⃗′ having different elements in two different bases while representing the same vector in 3D space.
We saw earlier that an abstract operator, defined over an LVS, gets a concrete form in terms of its matrix elements. The same idea holds here, with a general matrix element of an operator A being ⟨ϕ|A|ψ⟩. It is insightful to recall ûj · Aûi for the LVS of vectors in 3D. In the x-representation, the general matrix element for an operator A would be ⟨x′|A|x⟩. This is non-trivial since x is a continuous basis, which we postpone for later; still we can write the matrix element of the Hamiltonian operator for a particle in a potential V(x) as

⟨x′|H|x⟩ = δ(x′ − x)[−(ℏ²/2m) ∂²/∂x² + V(x)],  or  ⟨x′|p|x⟩ = −iℏ δ(x′ − x) ∂/∂x.

Please do not get misled into believing that such H or p are diagonal operators in the x-bases, as the derivative terms make them non-local and thus non-diagonal operators.
One can get certain insights into this aspect by analyzing the finite-element implemen-
tation of these operators for numerical calculations. In this case ψ(x) will take complex
values at discrete x locations, say . . .−3a, −2a, −a, 0, a, 2a, 3a . . ., making ψ(x) a column
matrix. Numerically the wave function ϕ(x) that would result from pψ(x) can easily be
imagined as a square matrix operating on the column matrix. Clearly this square matrix
will not be diagonal. More precisely, the equation |ϕ⟩ = p|ψ⟩ can be written as,
\begin{pmatrix} \vdots \\ c'_{-2} \\ c'_{-1} \\ c'_{0} \\ c'_{1} \\ c'_{2} \\ \vdots \end{pmatrix}
= -i\,\frac{\hbar}{2a}
\begin{pmatrix}
\ddots & & & & & & \\
\cdots & 0 & 1 & 0 & 0 & 0 & \cdots \\
\cdots & -1 & 0 & 1 & 0 & 0 & \cdots \\
\cdots & 0 & -1 & 0 & 1 & 0 & \cdots \\
\cdots & 0 & 0 & -1 & 0 & 1 & \cdots \\
\cdots & 0 & 0 & 0 & -1 & 0 & \cdots \\
 & & & & & & \ddots
\end{pmatrix}
\begin{pmatrix} \vdots \\ c_{-2} \\ c_{-1} \\ c_{0} \\ c_{1} \\ c_{2} \\ \vdots \end{pmatrix}
= -i\,\frac{\hbar}{2a}
\begin{pmatrix} \vdots \\ c_{0}-c_{-2} \\ c_{1}-c_{-1} \\ c_{2}-c_{0} \\ c_{3}-c_{1} \\ c_{4}-c_{2} \\ \vdots \end{pmatrix}

with cₙ ≡ ψ(na) and c′ₙ ≡ ϕ(na), i.e. c′ₙ = −i(ℏ/2a)(cₙ₊₁ − cₙ₋₁).
It is also easy to see from the above matrix form that p is Hermitian: the matrix itself is real
and antisymmetric and there is an 'i' multiplying it. Also one can write the discrete
matrix elements of p as pij = −i(ℏ/2a)(δi,j−1 − δi,j+1), which leads to
⟨ϕ|p|ψ⟩ = −iℏ Σₙ ϕ∗(na)[ψ((n + 1)a) − ψ((n − 1)a)]/2a. This form
helps in better comprehending the form ⟨x′|p|x⟩ = −iℏ δ(x′ − x) ∂/∂x. Now it is also quite
straightforward to conclude that an operator consisting of a form δ(x′ −x)V (x) is diagonal
in x-bases.
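As a concrete numerical sketch (a toy illustration, not part of the notes: ℏ = a = 1 and only 7 grid points are assumed for brevity), one can build the finite-difference p matrix above and check that it is Hermitian but not diagonal:

```python
import numpy as np

# Toy finite-difference momentum matrix p_ij = -i*(hbar/2a)*(delta_{i,j-1} - delta_{i,j+1})
# on N grid points, with hbar = a = 1 (illustrative choice).
N = 7
hbar, a = 1.0, 1.0
p = np.zeros((N, N), dtype=complex)
for i in range(N):
    if i + 1 < N:
        p[i, i + 1] = -1j * hbar / (2 * a)   # delta_{i,j-1} term (j = i+1)
    if i - 1 >= 0:
        p[i, i - 1] = +1j * hbar / (2 * a)   # -delta_{i,j+1} term (j = i-1)

# p is Hermitian: equal to its conjugate transpose ...
print(np.allclose(p, p.conj().T))                    # True

# ... but clearly not diagonal: the derivative couples neighbouring points.
print(np.count_nonzero(p - np.diag(np.diag(p))))     # 12 off-diagonal entries
```

A finite grid truncates the matrix at its edges; in the infinite-dimensional case every row has exactly the two off-diagonal entries.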
We'll be working with state wave-functions and operators in specific orthonormal bases
discussed in the next section. As we shall see, a wave function |ψ⟩ in a discrete
basis reduces to an infinite set of discrete complex numbers, ci = ⟨ui|ψ⟩, which we can also
imagine as an infinite-size column matrix. The Hermitian conjugate of this wave function,
i.e. ⟨ψ|, would then be a row matrix consisting of c∗i. Similarly an operator A in such a
discrete basis gets defined by its matrix elements Aij = ⟨ui|A|uj⟩. Thus an operator can
be easily imagined as an infinite-dimensional square matrix with elements Aij and we can
easily define its Hermitian conjugate, A†, consisting of elements (A†)ij = A∗ji.
This also gives an idea about the algebraic expressions involving bras, kets, operators
and complex numbers. One can, in principle, come across four valid types of entities that
are eventually equivalent to a column matrix (a ket), a row matrix (a bra), a square matrix
(an operator) or a complex number.
The expressions equivalent to product of two kets or two bras (like |ψ⟩|ϕ⟩ or ⟨ψ|⟨ϕ|) are
unphysical unless one is looking at the direct product states, which is beyond this course.
One often has to convert such complex expressions, such as c1 |ψ1 ⟩⟨ψ2 |A|ψ3 ⟩⟨ψ4 |, into its
Hermitian conjugate, i.e. [c1|ψ1⟩⟨ψ2|A|ψ3⟩⟨ψ4|]†. For this, the prescription is:

1. The order of entities in the product gets reversed and the † gets transferred to the
individual entities: an operator A becomes A†, a ket |ψ⟩ becomes the bra ⟨ψ|, and a
complex number c becomes c∗.
So the expression (c1 |ψ1 ⟩⟨ψ2 |A|ψ3 ⟩⟨ψ4 |)† would simplify to (c∗1 |ψ4 ⟩⟨ψ3 |A† |ψ2 ⟩⟨ψ1 |) which
is also equivalent to (c∗1 ⟨ψ3 |A† |ψ2 ⟩|ψ4 ⟩⟨ψ1 |). A useful exercise at this stage is to analyze
the Hermitian conjugates of A† , cA, AB, A + B, ⟨ψ|A, A|ϕ⟩.
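The dagger prescription can be spot-checked numerically in a small toy space (a sketch with randomly chosen 3-component kets and an arbitrary matrix A; none of these names come from the notes):

```python
import numpy as np

# Sketch check of the dagger rule: for M = c1 |psi1><psi2| A |psi3><psi4|,
# the prescription predicts M^dagger = c1^* |psi4><psi3| A^dagger |psi2><psi1|.
rng = np.random.default_rng(3)

def rvec(n=3):
    """Random complex 3-component ket (illustrative)."""
    return rng.normal(size=n) + 1j * rng.normal(size=n)

p1, p2, p3, p4 = rvec(), rvec(), rvec(), rvec()
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
c1 = 1.5 - 0.5j

# np.outer(a, b.conj()) represents the outer product |a><b|
M = c1 * np.outer(p1, p2.conj()) @ A @ np.outer(p3, p4.conj())
M_dag_rule = c1.conjugate() * np.outer(p4, p3.conj()) @ A.conj().T @ np.outer(p2, p1.conj())

print(np.allclose(M.conj().T, M_dag_rule))   # True
```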
When looking at composite expressions consisting of sums of such terms, one has to
realize that we can only add similar type expressions, i.e. column (row) matrix with a
column (row) matrix, a square matrix with a square matrix and complex number with
another complex number. Thus expressions such as c1 |ψ⟩ + c2 ⟨ϕ| or c1 |ψ⟩ + c2 or c1 |ψ⟩ + A
are meaningless.
We already have seen examples of two continuous basis-sets, namely x and p. The
continuous bases offer more conceptual difficulty though they are more often used. In
fact, it turns out that the complex-functions corresponding to basis-sets for continuous
bases, such as exp(ipx/ℏ) or δ(x − x0 ) are not normalizable and thus they are not valid
wave-functions themselves. This means that these continuous basis-sets do not belong to
the LVS itself. However, if we take a careful superposition of these basis-functions we can
ensure that the resulting complex functions are valid wave-functions. On the other hand
these basis-functions (like plane-waves) are often used, like in scattering, despite their
non-normalizability. However, one can justify them by putting some envelope (such as a
box function or some other function vanishing at infinity) in real or k-space. Thus such
functions essentially retain their plane wave character but they become normalizable. In
this case one can look at the basis-functions as some sort of limit of a series of wave-
functions.
The discrete bases are also encountered in QM and these are easier to comprehend.
An example is the eigen states for a particle in an infinite potential well, i.e. ψn(x) =
√(2/L) sin(nπx/L). This set {ψn(x)} gives a discrete set of functions whose superposition
will give any arbitrary wave-function that vanishes at x = 0 & x = L. Another example
is the eigen state wave-functions of a simple harmonic oscillator.
In fact, the complete set of eigen-states corresponding to a given Hamiltonian forms a
basis. However, we may end up getting mixed basis-set in case the Hamiltonian has both,
bound states and scattering states. The two examples (box and SHO) that I gave do not
have scattering states. It is much easier to discuss the formalism for discrete bases and we
can generalize to continuous bases by replacing the discrete summations by appropriate
integrals. The summations will also span infinite range as this is an infinite-dimensional
LVS. We shall use the Dirac notation and I hope you’ll get more comfortable with it as
we move forward. The Dirac notation is so convenient and popular that I cannot imagine
doing any QM without it.
We start with a discrete but infinite set {ui (x)} of orthonormal basis-functions, i.e.
{u1 (x), u2 (x), u3 (x), u4 (x), ...}. Now these are already written in x-representation, how-
ever we can bring the Dirac notation to represent an abstract basis-state as |ui ⟩ and its
Hermitian conjugate as ⟨ui |. This basically means ui (x) = ⟨x|ui ⟩ and u∗i (x) = ⟨ui |x⟩.
This (or any) basis-set satisfies the following:
1. {|ui⟩} or {ui(x)} forms a complete set, i.e. any valid state |ψ⟩ with wave-function
ψ(x) can be written as a linear superposition of these bases. We can write this in
both Dirac notation and otherwise: |ψ⟩ = Σi ci|ui⟩ or ψ(x) = Σi ci ui(x). The ci
values will be the same in both ways of writing. Here the ci are complex numbers and the
sum, in general, runs over an infinite range of i.
2. Orthonormality:

⟨ui|uj⟩ = ∫₋∞^∞ u∗i(x) uj(x) dx = δij.   (76)
The coefficients ci represent the ui(x) component in ψ(x). To find the ci we take the
scalar product of |ψ⟩ with |ui⟩ to get ⟨ui|ψ⟩ = ∫⟨ui|x⟩⟨x|ψ⟩dx = ∫u∗i(x)ψ(x)dx =
∫u∗i(x) Σj cj uj(x) dx = Σj cj ∫u∗i(x)uj(x)dx = Σj cj δij = ci. Thus ci = ⟨ui|ψ⟩ and we
can write ψ(x) = ⟨x|ψ⟩ = Σi ⟨x|ui⟩⟨ui|ψ⟩ = Σi ci ui(x). Effectively, what we have done
here is to introduce a unity operator, i.e.

Σi |ui⟩⟨ui| = 1   (77)

or ∫₋∞^∞ |x⟩⟨x| dx = 1   (78)
in between the bra and ket in ⟨x|ψ⟩ or ⟨ui |ψ⟩. The operator in Eq.77 is a sum of operators
|ui ⟩⟨ui |, which is called a projection operator as it projects out |ui ⟩ component from any
given state |ψ⟩, i.e. |ui ⟩⟨ui |[|ψ⟩] = |ui ⟩⟨ui |ψ⟩ = |ui ⟩ci . One can similarly define a general
projection operator Pϕ = |ϕ⟩⟨ϕ| to project out |ϕ⟩ from any given state.
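The closure relation Eq.77 and the action of a projection operator can be checked numerically in a finite-dimensional toy space (an illustrative sketch only; a genuine LVS in QM is infinite-dimensional):

```python
import numpy as np

# Toy 4-dimensional space: the columns of a random unitary matrix form an
# orthonormal basis {|u_i>}; then sum_i |u_i><u_i| = 1 (Eq.77) and
# P_0 = |u_0><u_0| projects out the |u_0> component of any |psi>.
rng = np.random.default_rng(0)
q, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))
basis = [q[:, i] for i in range(4)]

# closure: the sum of all projectors equals the identity
closure = sum(np.outer(u, u.conj()) for u in basis)
print(np.allclose(closure, np.eye(4)))        # True

# projection: P_0 |psi> = c_0 |u_0> with c_0 = <u_0|psi>
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
c0 = basis[0].conj() @ psi
P0 = np.outer(basis[0], basis[0].conj())
print(np.allclose(P0 @ psi, c0 * basis[0]))   # True
```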
Also we can work out the scalar product between the above ψ(x) and another wave-function
ϕ(x), which in the {ui(x)} bases is ϕ(x) = Σi bi ui(x) = Σi ⟨ui|ϕ⟩⟨x|ui⟩. Using the
unity operator from Eq.77, the scalar product gives ⟨ϕ|ψ⟩ = Σi ⟨ϕ|ui⟩⟨ui|ψ⟩ = Σi b∗i ci.
Or we can work it out using Eq.78 as ⟨ϕ|ψ⟩ = ∫⟨ϕ|x⟩⟨x|ψ⟩dx = ∫ϕ∗(x)ψ(x)dx =
∫ Σij b∗i u∗i(x) cj uj(x) dx, which leads to the same final expression after using the orthonormality
of {ui(x)}. Also we can see that ⟨ψ|ψ⟩ = Σi |ci|² = ∫|ψ|²dx.
The identity operators in Eq.77 & 78 are statements of closure (or completeness) for
the basis sets {|x⟩} and {|ui ⟩} in the sense that any wave-function can be written as linear
superposition of these basis functions.
There is another relation that follows from this
completeness: ψ(x) = ⟨x|ψ⟩ = Σi ⟨x|ui⟩⟨ui|ψ⟩ = Σi ui(x) ∫u∗i(x′)ψ(x′)dx′. Here we have
written the scalar product ⟨ui|ψ⟩ using the integral form in x-representation. This yields
ψ(x) = ∫[Σi ui(x)u∗i(x′)] ψ(x′)dx′ and for this to be true for any arbitrary ψ we must
have

Σi ui(x)u∗i(x′) = Σi ⟨x|ui⟩⟨ui|x′⟩ = δ(x − x′).   (79)
This is equivalent to Eq.77 (and Eq.78) if one inserts it in ⟨x|x′ ⟩ which is δ(x − x′ ). This
can be thought as the orthonormality condition for continuous x-bases.
As discussed in the previous section, an operator gets its concrete form in a given basis
as a square matrix, which one can use to find the operator's matrix elements between
two general wave-functions. For instance one can write ⟨ϕ|A|ψ⟩ by inserting the unity
operator of Eq.77 in two places to get ⟨ϕ| [Σi |ui⟩⟨ui|] A [Σj |uj⟩⟨uj|] |ψ⟩. This gives
Σij ⟨ϕ|ui⟩⟨ui|A|uj⟩⟨uj|ψ⟩, i.e. Σij b∗i Aij cj, i.e. the product of three matrices similar
to Eq.67 except for the infinite number of rows and columns in each. We can also use
Eq.78 in ⟨ϕ|A|ψ⟩ to write it as ⟨ϕ| [∫₋∞^∞ |x⟩⟨x|dx] A [∫₋∞^∞ |x′⟩⟨x′|dx′] |ψ⟩ to get
∫₋∞^∞ ∫₋∞^∞ ϕ∗(x)⟨x|A|x′⟩ψ(x′)dxdx′, which becomes ∫₋∞^∞ ϕ∗(x)Aψ(x)dx after we use ⟨x|A|x′⟩ =
Aδ(x′ − x).
Similarly, one can analyze other entities, such as A|ψ⟩ which will be a column matrix
resulting from the product of a square matrix A and column matrix for |ψ⟩, or ⟨ψ|A which
will be a row matrix resulting from the product of a row matrix corresponding to ⟨ψ| with
square matrix A. One can show that in the former case the ith element of the column
matrix will be given by Σj Aij cj while in the latter case the ith element of the row matrix
will be Σj c∗j Aji. It is a good exercise to think of the Hermitian conjugates of these two
results.
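These matrix manipulations can be spot-checked with a small random example (purely illustrative; the 3-dimensional space and all names are arbitrary):

```python
import numpy as np

# Sketch: in a discrete basis, <phi|A|psi> = sum_ij b_i^* A_ij c_j;
# A|psi> is a column matrix and <psi|A^dagger its Hermitian-conjugate row.
rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
b = rng.normal(size=3) + 1j * rng.normal(size=3)   # components b_i of |phi>
c = rng.normal(size=3) + 1j * rng.normal(size=3)   # components c_i of |psi>

lhs = b.conj() @ A @ c                      # <phi|A|psi> as a triple matrix product
rhs = sum(b[i].conj() * A[i, j] * c[j] for i in range(3) for j in range(3))
print(np.allclose(lhs, rhs))                # True

# Hermitian conjugate of the column A|psi> is the row <psi|A^dagger
ket = A @ c                                 # i-th element: sum_j A_ij c_j
bra = c.conj() @ A.conj().T                 # i-th element: sum_j c_j^* (A^dagger)_ji
print(np.allclose(ket.conj(), bra))         # True
```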
9.3 Bases change using Dirac notation:
If we look at the previous section, we basically practiced change of bases between |x⟩-
bases and the discrete |ui ⟩-bases. We can also look at change of bases between two discrete
bases, i.e. from |ui ⟩ to |vi ⟩ and how it transforms the wave functions and operators. In
fact one example is to change from x to p bases and we know that the wave functions
transform according√ to Eqs.74 & 75, which can be√revisited using the Dirac notation and
with ⟨x|p⟩ = (1/ 2πℏ) exp(ipx/ℏ) or ⟨p|x⟩ = (1/ 2πℏ) exp(−ipx/ℏ). We also discussed
in section 8.3 the discrete change of bases but without using the Dirac notation and for
real LVS. The Dirac notation P makes thePbases change very convenient,
P at
P least notation-
wise. We start with |ψ⟩ = i ⟨ui |ψ⟩ = i ci |ui ⟩ and |ψ⟩ =P i ⟨vi |ψ⟩ = i c′i |vi ⟩. Taking
the scalar product of firstP form with |vj ⟩ we get ⟨vj |ψ⟩ = i ci ⟨vj |ui ⟩ and then defining
′
Sji = ⟨vj |ui ⟩ we get cj = i Sji ci . This transformation matrix S thus defines the linear
relation between c′i and ci and the state vector in |vi ⟩ bases will be given by a column
matrix that results when one multiplies S with the column matrix of |ui ⟩ bases. We
can also think of the inverse transformation and in that case one has to multiply by S −1
with the column matrix in |vi ⟩ bases. For the two orthonormal bases, S works out to be
unitary, i.e. SS † = 1, and so S −1 = S † . It’s left as an exercise to prove the unitarity
starting from the elements of S, i.e. Sji = ⟨vj |ui ⟩ and using the orthonormality of the two
bases.
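A quick numerical sketch (toy 3-dimensional space, illustrative only) confirming that S built from two orthonormal bases is unitary and maps the components c to c′:

```python
import numpy as np

# Two orthonormal bases {|u_i>} and {|v_j>} from columns of random unitaries;
# S_ji = <v_j|u_i> should satisfy S S^dagger = 1 and c' = S c.
rng = np.random.default_rng(2)
U, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
V, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
u = [U[:, i] for i in range(3)]            # basis {|u_i>}
v = [V[:, j] for j in range(3)]            # basis {|v_j>}

S = np.array([[v[j].conj() @ u[i] for i in range(3)] for j in range(3)])
print(np.allclose(S @ S.conj().T, np.eye(3)))   # unitarity: S S^dagger = 1

# the same |psi> expanded in the two bases
psi = rng.normal(size=3) + 1j * rng.normal(size=3)
c = np.array([u[i].conj() @ psi for i in range(3)])     # c_i  = <u_i|psi>
cp = np.array([v[j].conj() @ psi for j in range(3)])    # c'_j = <v_j|psi>
print(np.allclose(cp, S @ c))                           # c' = S c
```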
We can also look at how an operator A transforms, as follows. We have

A′ij = ⟨vi|A|vj⟩ = Σkl ⟨vi|uk⟩⟨uk|A|ul⟩⟨ul|vj⟩ = Σkl Sik Akl S∗jl = (SAS†)ij.

Thus we can see that the operator in the |vi⟩ bases is given by A′ = SAS†. We can also argue
that the trace and the determinant of the operator remain independent of the bases by using
the fact that for a product of two matrices the trace and the determinant are independent
of the order of the product, i.e. Tr(AB) = Tr(BA) and Det(AB) = Det(BA).
10 Quantum Operators:
We review some of the properties of operators that are useful for QM. We begin by stating
that the operators associated with physically measurable quantities are always Hermitian
as they are guaranteed to have real eigen values. We recall that a measurement always
yields an eigen value. We have already seen in context of Hamiltonian and Schrödinger
equation what eigen values and eigen states mean. We shall be using the Dirac notation
for further discussions of this section.
When we solved the TISE, we saw how the eigen-energies and eigen-functions are calculated
by solving the differential equation. This was the case in the continuous x-basis. In discrete
basis for N -dimensional systems, the eigen value equation, i.e. Eq. 80, leads to N linear
homogeneous equations for finding the N elements of eigen vector |ψλ ⟩. This set of N -
equations will admit non-trivial solutions if and only if Det[A − λI] = 0. This leads to an
N th order polynomial equation in λ which will have N roots say {λi }. In case of equal
roots we get degenerate eigen vectors, i.e. set of linearly independent eigen-vectors having
same eigen values. The ith eigen vector can then be found by using λi in Eq. 80 to find
the elements of |ψλi ⟩.
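A minimal numerical sketch of this procedure (using numpy's eigh in place of expanding the characteristic polynomial by hand; the 3×3 Hermitian matrix below is an arbitrary toy example with a deliberate double root):

```python
import numpy as np

# Toy Hermitian (real symmetric) A whose characteristic polynomial
# Det[A - lambda*I] = 0 has roots 1, 3, 3 (the root 3 is doubly degenerate).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])

evals, evecs = np.linalg.eigh(A)           # eigenvalues in ascending order
print(np.round(evals, 6))                  # [1. 3. 3.]

# check one root via the determinant condition Det[A - lambda*I] = 0
print(np.isclose(np.linalg.det(A - 1.0 * np.eye(3)), 0.0))    # True

# each column of evecs satisfies the eigenvalue equation A|psi> = lambda|psi>
print(np.allclose(A @ evecs[:, 0], evals[0] * evecs[:, 0]))   # True
```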
Theorem-1: If A is Hermitian then all its eigen values λ are real and, conversely, if
all eigen values of A are real then it is Hermitian.
Theorem-2: The eigen states of a Hermitian operator A corresponding to two different
eigen values are orthogonal.
Proof: We have A|ϕ1⟩ = λ1|ϕ1⟩ and A|ϕ2⟩ = λ2|ϕ2⟩. Now consider ⟨ϕ2|A|ϕ1⟩∗ =
λ∗1⟨ϕ2|ϕ1⟩∗ = λ1⟨ϕ1|ϕ2⟩, using Theorem-1 for the reality of λ1. This is also equal to
⟨ϕ1|A†|ϕ2⟩ = ⟨ϕ1|A|ϕ2⟩, which gives λ2⟨ϕ1|ϕ2⟩. Subtracting the two, we
get (λ2 − λ1)⟨ϕ1|ϕ2⟩ = 0. Thus we get the required result that if λ1 ≠ λ2 then ⟨ϕ1|ϕ2⟩ = 0,
i.e. |ϕ1⟩ and |ϕ2⟩ are orthogonal.
The above theorem-2 will also apply to unitary operators which are also very important
in QM. The proof will follow the same logic together with the fact that U † and U have
common eigen states but with eigen values that are complex conjugate of each other. This
is easy to comprehend as a unitary operator U can always be written as U = exp(iλA)
with A as a Hermitian operator and λ as a real number. Can you prove the last sentence?
Theorem-3: If A and B are two commuting operators and if |ψλ⟩ is an eigen state
of A with eigen value λ then the state B|ψλ⟩ is also an eigen state of A with the same
eigen value.
Proof: The proof is almost trivial but the implications are very important. Since
AB = BA and A|ψλ⟩ = λ|ψλ⟩, we get A(B|ψλ⟩) = BA|ψλ⟩ = λ(B|ψλ⟩), which implies
that B|ψλ⟩ is an eigen state of A. Also, please note that the proof does not require
A & B to be Hermitian.
This leads to two possible cases:
1. λ is a non-degenerate eigen value of A, i.e. there is only one state of A with eigen
value λ. In this case B|ψλ ⟩ and |ψλ ⟩ can differ at most by a multiplicative constant,
i.e. B|ψλ ⟩ ∝ |ψλ ⟩ or B|ψλ ⟩ = µ|ψλ ⟩, with µ as a complex number. This implies that
|ψλ ⟩ is also an eigen state of B (with certain eigen value µ).
2. λ is a degenerate eigen value of A, i.e. there are many eigen states, say |ψλi ⟩, of A
with same eigen value λ. In this case we can only assert that B|ψλi ⟩ would be a
linear combination of all such |ψλi ⟩ states and it cannot be an eigen state of A with
different eigen value. In fact B|ψλi ⟩ will be orthogonal to all other eigen states of A
having different eigen values, i.e. ⟨ψµi |B|ψλi ⟩ = 0 with |ψµi ⟩ as an eigen state of A
with eigen value µ (̸= λ).
Theorem-4: If A & B commute, one can construct an orthonormal basis-set of the state
space with the basis vectors being eigen vectors of both A & B. Conversely, if there exists
a basis of eigen vectors common to A and B, then A and B commute.
We shall not prove this theorem but illustrate it with an example. Although the proof of
the converse is easy as both A and B will be diagonal operators in the referred common
basis-set of eigen vectors and the diagonal operators always commute. We can easily
construct a simple illustration using A and B in the basis |u1⟩, |u2⟩ & |u3⟩ as follows:
A = diag(1, 1, 2) and B = diag(1, 2, 1), with |u1⟩ = (1, 0, 0)ᵀ, |u2⟩ = (0, 1, 0)ᵀ and
|u3⟩ = (0, 0, 1)ᵀ.
Clearly, AB = BA and both have degeneracies. A has two independent eigen states |u1 ⟩
& |u2 ⟩ with eigen value 1. Thus a general linear combination of the two, i.e. c1 |u1 ⟩+c2 |u2 ⟩
is also an eigen state of A with eigen value 1. The third eigen state of A is |u3 ⟩ with eigen
value 2. For B, b1 |u1 ⟩ + b3 |u3 ⟩ gives the doubly degenerate eigen state with eigen value
1 and |u2 ⟩ is the third eigen state with eigen value 2. As given we already have chosen a
basis encompassing the eigen states of both A & B.
Now one can refer to the three eigen states uniquely by labeling them as (a, b) with a
and b as eigen values wrt A & B. This gives |u1 ⟩ → (1, 1), |u2 ⟩ → (1, 2) and |u3 ⟩ → (2, 1).
Thus B helps in lifting the degeneracy of A completely, or vice-versa, and with systematic
labeling and orthogonal states. The last one definitely holds in case both A and B are
Hermitian or unitary. In general one may need more than two commuting operators to
lift the degeneracy completely.
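This example can be verified directly; the matrices below are exactly the diagonal A and B from the illustration above:

```python
import numpy as np

# The 3x3 example from the text: A = diag(1,1,2) and B = diag(1,2,1)
# commute and together lift all degeneracies, labelling states as (a, b).
A = np.diag([1.0, 1.0, 2.0])
B = np.diag([1.0, 2.0, 1.0])
print(np.allclose(A @ B, B @ A))           # True: [A, B] = 0

# the standard basis vectors are common eigen vectors; their (a, b) labels
# (1,1), (1,2), (2,1) are all distinct, so the degeneracy is fully lifted
labels = [(float(A[i, i]), float(B[i, i])) for i in range(3)]
print(labels)                              # [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0)]
print(len(set(labels)) == 3)               # True: unique (a, b) per state
```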
This was an illustration using Dirac notation and simple 3 dimensional Hilbert space.
We can think of more realistic and physical examples (listed below) in QM where one
takes help of other operators, describing certain symmetry, to lift all the degeneracies and
to systematically label the eigen states.
1. Here we use p for the momentum operator and p for a number representing an eigen
value of p. The free particle, with Hamiltonian H = p²/2m, gives eigen states of H
with eigen value E as ψE(x) = c1 exp(i√(2mE)x/ℏ) + c2 exp(−i√(2mE)x/ℏ). Thus
there is a degeneracy of two. Two ways that can be used to lift this degeneracy are:
1) H commutes with p, 2) H commutes with Π (parity). In the former case we
get eigen states common to H and p as exp(ipx/ℏ) with eigen values p and
p²/2m wrt the two operators. In the latter case, we can choose the eigen states
common to H and Π as sin(√(2mE)x/ℏ) and cos(√(2mE)x/ℏ) with eigen values E
and ∓1 wrt the two operators. We note that p and Π do not commute so we
cannot find common eigen-states for these two. More explicitly, sin(√(2mE)x/ℏ) and
cos(√(2mE)x/ℏ) are not eigen states of p and exp(ipx/ℏ) is not an eigen state of Π.
2. Another example that we are familiar with is the hydrogen atom. In this case H
commutes with L², i.e. the square of the total angular momentum, and with Lz, i.e.
the z-component of the angular momentum. Thus one can lift all the degeneracies of H
(excluding spin) using L² and Lz and label the states uniquely using the eigen values
corresponding to the three operators, i.e. (n, l, m). Here, the eigen value wrt H is −E0/n²,
wrt L² it is l(l + 1)ℏ² and wrt Lz it is mℏ. In fact L² and Lz commute with H due to
underlying symmetries of H.
The set of operators that commute with each other and lift the degeneracy completely
is called a complete set of commuting operators (CSCO). This is important in QM as
it gives a systematic way to keep track of all the eigen states. The latter helps in
keeping account of states, as the total number of states with a given energy controls many
observable phenomena. We'll be seeing another problem where these ideas come in handy,
i.e. the periodic potential, which is important for understanding the physics of electrons in
solids.
11 Periodic potentials:
In a given solid the electrons experience a periodic potential due to the atomic cores arranged
in a periodic fashion. Different solids have different crystal structures and different
atoms, which leads to a variety of properties. To understand these properties we need to
first obtain the energies and wave-functions of the states that the electrons will occupy. It
turns out that these energies consist of continuous bands separated by gaps.
Finding this band structure of a given solid is an important problem in solid state physics
as this is the first step towards understanding various properties of a given solid.
We shall start with a general formalism on how to find the possible energy states that
electrons will assume in periodic potentials. The wave-function of these states takes the
form of a periodically modulated plane wave as dictated by Bloch’s theorem and thus
these states are also called Bloch states. Further, we shall discuss the periodic boundary
conditions: any given solid has a large number of atoms at the macroscopic scale,
but it is not mathematically infinite and thus not really periodic in the mathematical sense.
This also leads to only certain Bloch states being allowed.
⟨ϕ|c∗ c|ϕ⟩. Using the unitary property of Ta we get ⟨ϕ|ϕ⟩ = |c|2 ⟨ϕ|ϕ⟩ which implies |c|2 = 1
as we are analyzing the non-trivial eigen state |ϕ⟩. The unimodular nature of c implies
that it has a general form c = eiα with α as a real number. It is useful to find alternative
proofs of the statements used here as that leads to more insights.
All this means is that we can always find eigen states of H that are also eigen states
of Ta, i.e. they satisfy Ta ψ(x) = cψ(x) = e^{iα}ψ(x), or

ψα(x + a) = e^{iα} ψα(x).   (82)
I have used a subscript α with ψ as it refers to a wave-function having eigen value exp(iα)
wrt operator Ta . Eq.82 is essentially the statement of the Bloch’s theorem but it is usually
stated after imposing the periodic boundary condition which makes the values of α more
explicit as discussed next. We should also keep in mind that both H and Ta may have
degeneracies and at this stage it is not ruled out that combination of H and Ta still leave
some unresolved degeneracies. Use this as food for further thoughts.
We now impose the periodic boundary condition on a crystal of N sites, i.e. ψ(x + Na) = ψ(x)
(Eq.83). This can be thought of in 1D as wrapping the lattice into a circle so that the (N + 1)th site
coincides with the 1st site and everything, including the potential and the wave-function, repeats when
one completes a full circle. Now this makes the crystal of our interest mathematically
periodic.
Applying Eq.82 N times, we get ψ(x + Na) = exp(iNα)ψ(x) and then
using Eq.83 we get exp(iNα) = 1, which gives Nα = 2nπ or α = 2nπ/N. Here, n will
range from 0 to N − 1; beyond this range exp(i2nπ/N) maps back to one of the values for
n within 0 to N − 1. For instance if I take n = N − 1 + n1 (i.e. n > N − 1),
I get exp[i(2π + 2(n1 − 1)π/N)], i.e. a repeat of n − N (= n1 − 1) as e^{2πi} is trivially one.
This amounts to saying that if exp(iα) is an eigen value of Ta then exp(i(α + 2π)) is also
an eigen value for the same eigen state.
In the end what matters is that we keep n values spanning a range of N different
values. So we might as well use n ∈ [−N/2, N/2 − 1] or n ∈ [−N/2 + 1, N/2], and for large N
the N/2 − 1 or N/2 + 1 can be taken as N/2. In solid state physics, for a given lattice period
a, one rather uses k = α/a = 2πn/(Na) in place of α. Here k takes N different values separated
by ∆k = 2π/(Na) = 2π/L with L as the crystal length. k covers a range k ∈ [−π/a, π/a − 2π/(Na)], which for
large N can be taken as k ∈ [−π/a, π/a]. With k taking the place of α we can restate Eq.82 as

ψk(x + a) = e^{ika} ψk(x).   (84)
This is precisely how the Bloch theorem is stated in most textbooks. We can see that
Bloch wave-function ψk(x) is not periodic except for the special value k = 0. It turns out
that the Bloch wave function actually represents a periodically modulated plane wave,
i.e.

ψk(x) = e^{ikx} uk(x),   (85)
with uk (x) being periodic with lattice period, i.e. uk (x + a) = uk (x). Think what a
periodically modulated plane wave is and convince yourself on how Eq.85 represents a
periodically modulated plane wave. Eq.85 and Eq.84 are equivalent and both are inter-
changeably used as Bloch’s theorem statement. It is left as an exercise to prove that
Eq.85 indeed satisfies Eq.84. You can also prove the converse if you feel up to it [Hint: Is
e−ikx ψ(x) periodic?].
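One direction of this equivalence can be checked numerically in one line (with an arbitrarily chosen periodic uk, purely for illustration):

```python
import numpy as np

# Sketch: any psi_k(x) = exp(i*k*x) * u_k(x) with u_k periodic (period a)
# automatically satisfies the Bloch condition psi_k(x+a) = exp(i*k*a)*psi_k(x).
a, k = 1.0, 0.7                                      # illustrative values
u = lambda x: 2.0 + np.cos(2 * np.pi * x / a)        # some periodic u_k(x)
psi = lambda x: np.exp(1j * k * x) * u(x)            # Bloch form, Eq.85

x = np.linspace(-3, 3, 601)
print(np.allclose(psi(x + a), np.exp(1j * k * a) * psi(x)))   # True: Eq.84 holds
```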
Figure 32: Left: Potential in KP model as an array of finite depth wells with period a.
Right: The KP potential in the δ-function limit.
The band-structure calculations for real systems (even in 1D) are rather cumbersome
and one needs to make many approximations. More than one full-semester course would be
required to learn the state-of-the-art techniques for band structure calculations. However,
to get a flavor I discuss a simple 1D model, called Kronig-Penney (KP) model, which is
also exactly solvable in the δ-function limit. In fact, I’ll only solve the δ-function limit of
this model. The potential for the KP model is shown in Fig.32. The left one shows it as
a periodic array (period a) of finite potential wells separated by barriers of height V0 and
width b while the right one shows it in the δ function limit. In the latter case V0 → ∞
and b → 0 such that bV0 = γ stays constant. We can write, in the latter case,
V(x) = γ Σn δ(x − na).   (86)
We shall be using periodic boundary conditions with a large N and Bloch's theorem,
i.e. Eq.84. We do not know the eigen energy (0 < E < ∞) values and we also have to
find the corresponding wave-functions. Since the potential vanishes except when x = na,
away from these points the TISE reads −(ℏ²/2m) d²ψk(x)/dx² = Eψk(x). Thus we can write
the solution,
for 0 < x < a, ψk (x) = A exp(iλx) + B exp(−iλx) (87)
Here A, B and λ² = 2mE/ℏ² will be dependent on k. Using Eq.84 we can write ψk(x) =
exp(−ika)ψk(x + a), which can be used to assert,

for − a < x < 0,  ψk(x) = exp(−ika)[Ae^{iλ(x+a)} + Be^{−iλ(x+a)}]   (88)
Now this ψk(x) has to satisfy two boundary conditions at x = 0, namely,
1. Continuity, i.e.

ψk(x)|x=0+ = ψk(x)|x=0−   (89)

2. The jump in the derivative dictated by the δ-function (obtained by integrating the
TISE across x = 0), i.e.

ψ′k(x)|x=0+ − ψ′k(x)|x=0− = (2mγ/ℏ²) ψk(0)   (90)
Our goal here is to find A, B and E as a function of k. We use Eq.87 & 88 in Eq.89 to
get
(A + B) = e−ika (Aeiλa + Be−iλa )
or B[1 − e−i(k+λ)a ] = A[e−i(k−λ)a − 1] (91)
and we use Eq.87 & 88 in Eq.90 to get

iλ(A − B) − e^{−ika} iλ(Ae^{iλa} − Be^{−iλa}) = (2mγ/ℏ²)(A + B)

or B[−iλ + iλe^{−i(k+λ)a} − 2mγ/ℏ²] = A[−iλ + iλe^{−i(k−λ)a} + 2mγ/ℏ²]   (92)

Now to eliminate A & B, we cross multiply Eq.91 & Eq.92 to get

iλ − iλe^{−i(k+λ)a} + 2mγ/ℏ² − iλe^{−i(k−λ)a} + iλe^{−2ika} − (2mγ/ℏ²)e^{−i(k−λ)a}
= −iλ + iλe^{−i(k−λ)a} + 2mγ/ℏ² + iλe^{−i(k+λ)a} − iλe^{−2ika} − (2mγ/ℏ²)e^{−i(k+λ)a}

or 2iλ − 2iλe^{−ika}[e^{iλa} + e^{−iλa}] + 2iλe^{−2ika} − (2mγ/ℏ²)e^{−ika}[e^{iλa} − e^{−iλa}] = 0

or 2iλ − 2iλe^{−ika} · 2 cos(λa) + 2iλe^{−2ika} − (2mγ/ℏ²)e^{−ika} · 2i sin(λa) = 0

Dividing by 2iλe^{−ika} and introducing P = mγa/ℏ², we get

e^{ika} + e^{−ika} − 2 cos(λa) − (2P/λa) sin(λa) = 0

or 2 cos(ka) − 2 cos(λa) − (2P/λa) sin(λa) = 0

or cos(ka) = (P/λa) sin(λa) + cos(λa)   (93)

This is the transcendental equation to find valid λ for a given k value. Given this λ(k)
one can also deduce B in terms of k and A by using Eq.91 (or Eq.92). As usual A will be
dictated by the normalization condition. In order to find λ for a given k one has to solve
Eq.93 numerically.
11.3.1 Energy Bands in KP Model
We review a graphical method to solve the above transcendental equation to find the
eigen-energy or λ values. Fig.33 shows the plot of (P/λa) sin(λa) + cos(λa), i.e. the RHS of Eq.93
for P = 5, as a function of λa in units of π. Please look at the plot and the expression
being plotted to convince yourself about some limiting points such as λa = 0, π/2, π, 2π, etc.
Also, can you find the lowest λa for which this expression vanishes?
Figure 33: Plot of the RHS of Eq.93 for P = 5 as a function of λa. The red and blue
horizontal lines represent +1 and -1, respectively, corresponding to ka = 0 & π.
We know that k will take N different discrete values in the interval (−π/a, π/a] separated
by ∆k = 2π/(Na) and thus cos(ka) will take values between −1 and +1. For instance,
for k = 0, cos(ka) = 1 which is shown as the red horizontal line in Fig.33. This line
intersects the KP-curve at many points. Each intersection point can be projected on the
horizontal axis as shown by red dots (for cos ka = +1) on horizontal axis in Fig.33. The
λ values corresponding to these dots give energy E = ℏ2 λ2 /2m. Thus for each given k
we see that there are many (discrete) energies possible. Similarly for k = π/a (or −π/a),
cos(ka) = −1 and the line corresponding to this is shown as blue horizontal line, which
again intersects KP-curve at many points that can be projected as blue dots.
For any given k the horizontal line corresponding to cos(ka) will lie within these two
red and blue lines. There are some ranges of λ for which the KP-curve will not intersect
horizontal cos(ka) line corresponding to any k. These values (between two consecutive
red or blue dots) of λ are forbidden as depicted in Fig.34 implying that certain range
(or bands) of energies are forbidden. The allowed bands of energies or λ (i.e. between
a red dot and the immediate blue dot or vice-versa) are separated by band-gaps. Also,
a horizontal line corresponding to cos(ka) for a given k intersects the KP-curve at many
points with one point in each allowed band. This implies that there are many (discrete)
energies possible for a given k with each energy lying in a different band.
We also see that k takes rather continuous values as ∆k = 2π/N a is very small for
large N . Thus we get continuous energy bands separated by clear gaps. Please also
note that a general intersection point in Fig.33 will correspond to two different k values
that differ in sign as cos ka is an even function of k. This, in fact, implies that there
is a degeneracy of two for each energy and the Bloch wave-vector k helps in lifting this
degeneracy. This aspect is further discussed later on. If we numerically calculate these
energy bands by finding λ values for each k we get E as a function of k for different bands
as shown in the left panel of Fig.35 for P = 5. The black lines in this plot are the bands
that are separated by energy gaps from the neighboring ones.
Figure 34: Plot of the KP equation with the grey regions showing forbidden bands of λa
responsible for the band gaps in energy.
It is insightful to analyze how this band structure will evolve when one varies P from
zero (i.e. no δ-functions) to infinity. For P → ∞, the potential will be equivalent to an
array of infinite wells while for P = 0 it becomes a free particle. Thus for P → 0 we should
get free particle like energies and for P → ∞ it should give particle-in-a-box energies.
The plot in the right panel of Fig.35 shows the evolution of the RHS of Eq.93 when P
increases from 5 to 40. From this we can see that the bands are expected to get narrower.
In fact the bands in P → ∞ limit collapse on the red lines of the left plot in Fig.35 which
are precisely the energies corresponding to the particle-in-a-box. On the other hand when
P → 0 we see that the energy-gaps will approach zero and we get an E(k) which is
equivalent to the free particle result, i.e. E = ℏ2 k 2 /2m, except that different portions
of the free particle E(k) have been shifted and brought into the k-region from −π/a to
π/a. In this model P is a measure of interaction between the neighboring potential wells
73
with large P representing small interaction. We see that large interaction (small P ) we
get broad bands and for small interaction (large P ) we get narrow bands. This is a very
general point which gets captured in this simple and exactly solvable model.
Figure 35: Left: Energy band diagram E(k) for the KP model for P = 5. The red lines mark
the energies corresponding to the infinite well potential. Right: plot of the KP equation for
different P values depicting the narrowing of the allowed λ ranges with increasing P.
11.3.2 Wave-functions in KP Model:
One can also plot some characteristic wave-functions of the KP model to internalize the
modulated plane-wave character and other aspects. This requires knowing B as a function
of k and A, while A, determining the overall scale of ψ, can be left out as a normalization
constant. In fact a general state, its energy and wave-function depend on two things: k
and the band-index. Thus we symbolically write En(k) for the k-dependent energy of different
bands and ψnk for the band-index (n) and k dependent wave-function. Some of the wave-
Figure 36: Plots of the Re[ψ], Im[ψ] (red and blue) and |ψ| for the bands and ka as
marked. The last one, with much longer x-range, is for E just above the bottom of the
first band with ka ≃ 0.012π.
75
functions for KP model are plotted in Fig.36. We can list a few general observations
about these wave functions as follows:
1. The odd band (1st, 3rd, 5th,..) minimum energy wave functions have k = 0 and
max energy wave functions have ka = π.
2. The even band (2nd, 4th, 6th,..) minimum energy wave functions have ka = π and
max energy wave functions have k = 0.
3. For k = 0 the wave functions are periodic with a as ψk=0 (x + a) = ψk=0 (x).
4. For ka = π the wave functions change sign from one period to the next, i.e.
ψka=π (x + a) = −ψka=π (x).
5. With increasing energy the average number of nodes per unit length increases. For
lowest energy there are no nodes.
6. The highest energy state of the odd bands corresponds to ka = π and that of the
even bands corresponds to k = 0. In fact, if we look at the K-P plot in Fig.33, the
maximum energy point of every band corresponds to λa = nπ with n ≠ 0. These
highest energy states have the same energies as those of a particle in an infinite well.
These wave functions can also be mapped to infinite-well wave-functions, as they
vanish at the δ-function locations, i.e. x = na, leaving no discontinuity in their
derivatives.
7. Finally, in the last plot of Fig.36, drawn over a much larger x-range, we can
recognize the modulated plane-wave nature of the wave function: a plane wave
modulated with period a. The plane-wave envelope has a large wavelength given by
2π/k ≫ a, see Eq.85. The real and imaginary parts of the wave-function evolve
spatially in quadrature, with |ψk (x + a)| = |ψk (x)|.
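The P → 0 statement made earlier, namely that E(k) reduces to the folded free-particle dispersion, can also be checked numerically. The sketch below again assumes the δ-array KP condition cos(ka) = cos(λa) + P sin(λa)/(λa) for Eq.93 and uses a simple bracketing-plus-bisection root search; the helper names are illustrative:

```python
import numpy as np

# Assumed KP condition (Eq.93): cos(ka) = cos(λa) + P sin(λa)/(λa); E ∝ (λa)².
def kp_rhs(la, P):
    return np.cos(la) + P * np.sinc(la / np.pi)  # np.sinc(x) = sin(pi*x)/(pi*x)

def la_of_k(ka, P, band):
    """Solve the KP condition for λa within the given band (λa in ((band-1)π, band·π])."""
    lo, hi = (band - 1) * np.pi + 1e-9, band * np.pi
    f = lambda x: kp_rhs(x, P) - np.cos(ka)
    xs = np.linspace(lo, hi, 4001)
    v = f(xs)
    i = np.flatnonzero(v[:-1] * v[1:] <= 0)[0]   # first sign change brackets the root
    a, b = xs[i], xs[i + 1]
    for _ in range(60):                          # plain bisection refinement
        m = 0.5 * (a + b)
        if f(a) * f(m) <= 0:
            b = m
        else:
            a = m
    return 0.5 * (a + b)

# For tiny P the bands reproduce the folded free-particle dispersion:
# band 1 gives λa ≈ ka, band 2 gives λa ≈ 2π − ka, etc.
ka = 0.5 * np.pi
print(la_of_k(ka, 1e-3, 1) / np.pi, la_of_k(ka, 1e-3, 2) / np.pi)
```

For P = 10⁻³ and ka = π/2 this yields λa/π ≈ 0.5 in the first band and ≈ 1.5 in the second, which is exactly the free-particle parabola with its branches shifted back into the region −π/a < k < π/a.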
cases. Given the number of electrons, arising from the valence electrons of the atoms,
we fill them into these bands respecting the Pauli exclusion principle, i.e. we put two
electrons (of opposite spin) per energy and k state. We’ll end up filling certain bands
completely, and in several cases we’ll have partially filled bands. In the former case the
solid behaves as an insulator (or semiconductor), as one needs a minimum energy, equal
to the band gap, to excite an electron. In the latter case, i.e. partially filled bands, we
get a metal.
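The counting behind this can be made concrete with a toy sketch. It assumes the standard result (implicit above) that each band of a crystal with N cells contains N distinct k states, each holding two electrons of opposite spin, so a band accommodates 2N electrons; within this simple one-dimensional picture, an odd valence count per cell forces a partially filled band:

```python
# Toy electron-counting sketch: a band of an N-cell crystal has N k-states,
# each holding two electrons (spin up and down), i.e. 2N electrons per band.
def band_filling(n_cells, valence_electrons_per_cell):
    total = n_cells * valence_electrons_per_cell
    per_band = 2 * n_cells
    full_bands = total // per_band                       # completely filled bands
    fraction_of_next = (total % per_band) / per_band     # filling of the next band
    return full_bands, fraction_of_next

# Odd electron count per cell -> half-filled top band -> metal;
# even count can fill bands completely -> insulator or semiconductor.
print(band_filling(1000, 1))  # (0, 0.5): half-filled band
print(band_filling(1000, 2))  # (1, 0.0): one completely full band
```

In real three-dimensional solids overlapping bands can complicate this rule, but in the one-dimensional KP picture the parity of the valence count per cell decides metal versus insulator.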