Absolute Continuity and Density Functions

Basic Theory
Our starting point is a measurable space (S, S ). That is, S is a set and S is a σ-algebra of subsets of S . In the last section,
we discussed general measures on (S, S ) that can take positive and negative values. Special cases are positive measures,
finite measures, and our favorite kind, probability measures. In particular, we studied properties of general measures, ways
to construct them, special sets (positive, negative, and null), and the Hahn and Jordan decompositions.
In this section, we see how to construct a new measure from a given positive measure using a density function, and we
answer the fundamental question of when a measure has a density function relative to the given positive measure.

Relations on Measures
The answer to the question involves two important relations on the collection of measures on (S, S ) that are defined in
terms of null sets. Recall that A ∈ S is null for a measure μ on (S, S ) if μ(B) = 0 for every B ∈ S with B ⊆ A . At
the other extreme, A ∈ S is a support set for μ if the complement A^c is a null set. Here are the basic definitions:

Suppose that μ and ν are measures on (S, S ).

1. ν is absolutely continuous with respect to μ if every null set of μ is also a null set of ν . We write ν ≪ μ .
2. μ and ν are mutually singular if there exists A ∈ S such that A is null for μ and A^c is null for ν . We write μ ⊥ ν .

Thus ν ≪ μ if every support set of μ is a support set of ν . At the opposite end, μ ⊥ ν if μ and ν have disjoint
support sets.
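On a finite set with the σ-algebra of all subsets, both relations reduce to statements about point masses, because a set is null exactly when every point in it has mass zero. The following is a minimal sketch under that assumption; the function names and the dictionaries mu, nu, and rho are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def null_points(mass):
    """Points of mass zero; a set is null exactly when it is a subset of these."""
    return {x for x, m in mass.items() if m == 0}

def is_abs_continuous(nu, mu):
    """nu << mu: every null set of mu is also a null set of nu."""
    return null_points(mu) <= null_points(nu)

def is_singular(mu, nu):
    """mu ⊥ nu: some set A is null for mu while its complement is null for nu."""
    # Take A to be the largest mu-null set; its complement must then be nu-null.
    complement = set(mu) - null_points(mu)
    return complement <= null_points(nu)

# Hypothetical measures on S = {"a", "b", "c"}, given by their point masses.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(0)}
nu = {"a": Fraction(1, 4), "b": Fraction(0), "c": Fraction(0)}
rho = {"a": Fraction(0), "b": Fraction(0), "c": Fraction(3)}
```

With these values, nu is absolutely continuous with respect to mu but not conversely, and rho is singular with respect to both mu and nu.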

Suppose that μ , ν , and ρ are measures on (S, S ). Then

1. μ ≪ μ , the reflexive property.
2. If μ ≪ ν and ν ≪ ρ then μ ≪ ρ , the transitive property.

Recall that every relation that is reflexive and transitive leads to an equivalence relation, and then in turn, the original
relation can be extended to a partial order on the collection of equivalence classes. This general theorem on relations leads
to the following two results.

Measures μ and ν on (S, S ) are equivalent if μ ≪ ν and ν ≪ μ , and we write μ ≡ ν . The relation ≡ is an
equivalence relation on the collection of measures on (S, S ). That is, if μ , ν , and ρ are measures on (S, S ) then
1. μ ≡ μ , the reflexive property
2. If μ ≡ ν then ν ≡ μ , the symmetric property
3. If μ ≡ ν and ν ≡ ρ then μ ≡ ρ , the transitive property

Thus, μ and ν are equivalent if they have the same null sets and thus the same support sets. This equivalence relation is
rather weak: equivalent measures have the same support sets, but the values assigned to these sets can be very different. As
usual, we will write [μ] for the equivalence class of a measure μ on (S, S ), under the equivalence relation ≡.

If μ and ν are measures on (S, S ), we write [μ] ⪯ [ν ] if μ ≪ ν . The definition is consistent, and defines a partial
order on the collection of equivalence classes. That is, if μ , ν , and ρ are measures on (S, S ) then
1. [μ] ⪯ [μ] , the reflexive property.
2. If [μ] ⪯ [ν ] and [ν ] ⪯ [μ] then [μ] = [ν ] , the antisymmetric property.
3. If [μ] ⪯ [ν ] and [ν ] ⪯ [ρ] then [μ] ⪯ [ρ], the transitive property

The singularity relation is trivially symmetric and is almost anti-reflexive.

Suppose that μ and ν are measures on (S, S ). Then

1. If μ ⊥ ν then ν ⊥ μ , the symmetric property.
2. μ ⊥ μ if and only if μ = 0 , the zero measure.

Absolute continuity and singularity are preserved under multiplication by nonzero constants.

Suppose that μ and ν are measures on (S, S ) and that a, b ∈ R ∖ {0} . Then
1. ν ≪ μ if and only if aν ≪ bμ .
2. ν ⊥μ if and only if aν ⊥ bμ .

There is a corresponding result for sums of measures.

Suppose that μ is a measure on (S, S ) and that ν_i is a measure on (S, S ) for each i in a countable index set I .
Suppose also that ν = ∑_{i∈I} ν_i is a well-defined measure on (S, S ).

1. If ν_i ≪ μ for every i ∈ I then ν ≪ μ .
2. If ν_i ⊥ μ for every i ∈ I then ν ⊥ μ .

As before, note that ν = ∑_{i∈I} ν_i is well-defined if ν_i is a positive measure for each i ∈ I or if I is finite and ν_i is a finite
measure for each i ∈ I . We close this subsection with a couple of results that involve both the absolute continuity relation
and the singularity relation.

Suppose that μ , ν , and ρ are measures on (S, S ). If ν ≪ μ and μ ⊥ ρ then ν ⊥ρ .


Suppose that μ and ν are measures on (S, S ). If ν ≪ μ and ν ⊥μ then ν =0 .


Density Functions
We are now ready for our study of density functions. Throughout this subsection, we assume that μ is a positive, σ-finite
measure on our measurable space (S, S ). Recall that if f : S → R is measurable, then the integral of f with respect to μ
may exist as a number in the extended real number system R* = R ∪ {−∞, ∞} or may fail to exist.

Suppose that f : S → R is a measurable function whose integral with respect to μ exists. Then the function ν defined by

ν(A) = ∫_A f dμ, A ∈ S (3.13.1)

is a σ-finite measure on (S, S ) that is absolutely continuous with respect to μ . The function f is a density function of
ν relative to μ .


The following three special cases are the most important:

1. If f is nonnegative (so that the integral exists in R ∪ {∞} ) then ν is a positive measure since ν (A) ≥ 0 for A ∈ S .
2. If f is integrable (so that the integral exists in R), then ν is a finite measure since ν (A) ∈ R for A ∈ S .
3. If f is nonnegative and ∫_S f dμ = 1 then ν is a probability measure since ν(A) ≥ 0 for A ∈ S and ν(S) = 1 .

In case 3, f is the probability density function of ν relative to μ , our favorite kind of density function. When they exist,
density functions are essentially unique.

Suppose that ν is a σ-finite measure on (S, S ) and that ν has density function f with respect to μ . Then g : S → R is
a density function of ν with respect to μ if and only if f = g almost everywhere on S with respect to μ .
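As a small illustration of the construction and of essential uniqueness, here is a sketch on a finite set where the integral over A reduces to a weighted sum over the points of A. The set, the measure mu, and the functions f and g are hypothetical values chosen for the example.

```python
from fractions import Fraction

def measure_from_density(f, mu):
    """Return nu with nu(A) = sum over x in A of f(x) * mu({x})."""
    def nu(A):
        return sum(f[x] * mu[x] for x in A)
    return nu

# Hypothetical positive measure on S = {1, 2, 3}; the point 3 is mu-null.
mu = {1: Fraction(2), 2: Fraction(1), 3: Fraction(0)}

# A nonnegative density whose integral over S is 1, so nu is a probability measure.
f = {1: Fraction(1, 4), 2: Fraction(1, 2), 3: Fraction(5)}

# g differs from f only on the mu-null set {3}, so f = g almost everywhere.
g = {1: Fraction(1, 4), 2: Fraction(1, 2), 3: Fraction(0)}

nu_f = measure_from_density(f, mu)
nu_g = measure_from_density(g, mu)
```

Here nu_f and nu_g agree on every subset of S, even though f and g differ at the point 3.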

The essential uniqueness of density functions can fail if the positive measure space (S, S , μ) is not σ-finite. A simple
example is given below. Our next result answers the question of when a measure has a density function with respect to μ ,
and is the fundamental theorem of this section. The theorem is in two parts: Part (a) is the Lebesgue decomposition
theorem, named for our old friend Henri Lebesgue. Part (b) is the Radon-Nikodym theorem, named for Johann Radon and
Otto Nikodym. We combine the theorems because our proofs of the two results are inextricably linked.

Suppose that ν is a σ-finite measure on (S, S ).

1. Lebesgue Decomposition Theorem. ν can be uniquely decomposed as ν = ν_c + ν_s where ν_c ≪ μ and ν_s ⊥ μ .
2. Radon-Nikodym Theorem. ν_c has a density function with respect to μ .
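On a finite set with all subsets measurable, both parts of the theorem can be carried out by hand: the absolutely continuous part keeps the mass of ν on the points where μ is positive, the singular part keeps the rest, and the density is a ratio of point masses. The sketch below assumes that finite setting; the function name and the sample measures are hypothetical.

```python
from fractions import Fraction

def lebesgue_decomposition(nu, mu):
    """Split nu into (nu_c, nu_s, density) relative to a positive measure mu.

    Both measures are given by point masses on the same finite set.
    """
    positive = {x for x in mu if mu[x] != 0}  # complement is the largest mu-null set
    nu_c = {x: (nu[x] if x in positive else Fraction(0)) for x in nu}
    nu_s = {x: (Fraction(0) if x in positive else nu[x]) for x in nu}
    # Off the set where mu is positive the density value is arbitrary; 0 is used here.
    density = {x: (nu[x] / mu[x] if x in positive else Fraction(0)) for x in nu}
    return nu_c, nu_s, density

# Hypothetical point masses on S = {"a", "b", "c", "d"}.
mu = {"a": Fraction(2), "b": Fraction(1), "c": Fraction(0), "d": Fraction(0)}
nu = {"a": Fraction(1), "b": Fraction(0), "c": Fraction(3), "d": Fraction(0)}

nu_c, nu_s, density = lebesgue_decomposition(nu, mu)
```

In this example nu_s is concentrated on the μ-null set {"c", "d"}, and the density reproduces nu_c through nu_c({x}) = density(x) μ({x}).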


In particular, a measure ν on (S, S ) has a density function with respect to μ if and only if ν ≪ μ . The density function in
this case is also referred to as the Radon-Nikodym derivative of ν with respect to μ and is sometimes written in derivative
notation as dν /dμ. This notation, however, can be a bit misleading because we need to remember that a density function is
unique only up to a μ -null set. Also, the Radon-Nikodym theorem can fail if the positive measure space (S, S , μ) is not
σ-finite. A couple of examples are given below. Next we characterize the Hahn decomposition and the Jordan
decomposition of ν in terms of the density function.

Suppose that ν is a measure on (S, S ) with ν ≪ μ , and that ν has density function f with respect to μ . Let
P = {x ∈ S : f(x) ≥ 0} , and let f^+ and f^− denote the positive and negative parts of f .

1. A Hahn decomposition of ν is (P, P^c).
2. The Jordan decomposition is ν = ν_+ − ν_− where ν_+(A) = ∫_A f^+ dμ and ν_−(A) = ∫_A f^− dμ , for A ∈ S .

The following result is a basic change of variables theorem for integrals.

Suppose that ν is a positive measure on (S, S ) with ν ≪ μ and that ν has density function f with respect to μ . If
g : S → R is a measurable function whose integral with respect to ν exists, then

∫_S g dν = ∫_S g f dμ (3.13.6)
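The change of variables formula can be checked directly on a finite set, where every integral is a finite weighted sum. This is only a sanity check under that assumption, with hypothetical values for mu, f, and g.

```python
from fractions import Fraction

def integral(h, m):
    """Integral of h with respect to the measure m given by point masses."""
    return sum(h[x] * m[x] for x in m)

# Hypothetical positive measure mu and nonnegative density f on S = {0, 1, 2}.
mu = {0: Fraction(1), 1: Fraction(2), 2: Fraction(3)}
f = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(0)}

# nu is the positive measure with density f relative to mu.
nu = {x: f[x] * mu[x] for x in mu}

# Any real-valued g is integrable here because S is finite.
g = {0: Fraction(4), 1: Fraction(-2), 2: Fraction(7)}

lhs = integral(g, nu)                               # integral of g with respect to nu
rhs = integral({x: g[x] * f[x] for x in mu}, mu)    # integral of g f with respect to mu
```

Both sides evaluate to the same number, as the theorem asserts.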


In differential notation, the change of variables theorem has the familiar form dν = f dμ , and this is really the justification
for the derivative notation f = dν /dμ in the first place. The following result gives the scalar multiple rule for density
functions.

Suppose that ν is a measure on (S, S ) with ν ≪ μ and that ν has density function f with respect to μ . If c ∈ R , then
cν has density function cf with respect to μ .


Of course, we already knew that ν ≪ μ implies cν ≪ μ for c ∈ R , so the new information is the relation between the
density functions. In derivative notation, the scalar multiple rule has the familiar form

d(cν)/dμ = c dν/dμ (3.13.9)

The following result gives the sum rule for density functions. Recall that two measures are of the same type if neither takes
the value ∞ or if neither takes the value −∞ .

Suppose that ν and ρ are measures on (S, S ) of the same type with ν ≪ μ and ρ ≪ μ , and that ν and ρ have density
functions f and g with respect to μ , respectively. Then ν + ρ has density function f + g with respect to μ .

Of course, we already knew that ν ≪ μ and ρ ≪ μ imply ν + ρ ≪ μ , so the new information is the relation between the
density functions. In derivative notation, the sum rule has the familiar form

d(ν + ρ)/dμ = dν/dμ + dρ/dμ (3.13.11)

The following result is the chain rule for density functions.

Suppose that ν is a positive measure on (S, S ) with ν ≪ μ and that ν has density function f with respect to μ .
Suppose ρ is a measure on (S, S ) with ρ ≪ ν and that ρ has density function g with respect to ν . Then ρ has density
function gf with respect to μ .

Of course, we already knew that ν ≪ μ and ρ ≪ ν imply ρ ≪ μ , so once again the new information is the relation
between the density functions. In derivative notation, the chain rule has the familiar form

dρ/dμ = (dρ/dν)(dν/dμ) (3.13.12)

The following related result is the inverse rule for density functions.

Suppose that ν is a positive measure on (S, S ) with ν ≪ μ and μ ≪ ν (so that ν ≡ μ ). If ν has density function f
with respect to μ then μ has density function 1/f with respect to ν .


In derivative notation, the inverse rule has the familiar form

dμ/dν = 1 / (dν/dμ) (3.13.13)

Examples and Special Cases

Discrete Spaces
Recall that a discrete measure space (S, S , #) consists of a countable set S with the σ-algebra S = P(S) of all subsets
of S , and with counting measure #. Of course # is a positive measure and is trivially σ-finite since S is countable. Note
also that ∅ is the only set that is null for #. If ν is a measure on S , then by definition, ν(∅) = 0 , so ν is absolutely
continuous relative to #. Thus, by the Radon-Nikodym theorem, ν can be written in the form

ν(A) = ∑_{x∈A} f(x), A ⊆ S (3.13.14)


for a unique f : S → R . Of course, this is obvious by a direct argument. If we define f (x) = ν {x} for x ∈ S then the
displayed equation follows by the countable additivity of ν .

Spaces Generated by Countable Partitions

We can generalize the last discussion to spaces generated by countable partitions. Suppose that S is a set and that
A = {A_i : i ∈ I} is a countable partition of S into nonempty sets. Let S = σ(A ) and recall that every A ∈ S has a
unique representation of the form A = ⋃_{j∈J} A_j where J ⊆ I . Suppose now that μ is a positive measure on S with
0 < μ(A_i) < ∞ for every i ∈ I . Then once again, the measure space (S, S , μ) is σ-finite and ∅ is the only null set.

Hence if ν is a measure on (S, S ) then ν is absolutely continuous with respect to μ and hence has unique density function
f with respect to μ :

ν(A) = ∫_A f dμ, A ∈ S (3.13.15)


Once again, we can construct the density function explicitly.

In the setting above, define f : S → R by f(x) = ν(A_i)/μ(A_i) for x ∈ A_i and i ∈ I . Then f is the density of ν with
respect to μ .
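The explicit density for a partition space can be sketched in a few lines. The example below assumes a finite partition with three blocks; the block names, the values of μ and ν on the blocks, and the function names are all hypothetical.

```python
from fractions import Fraction

# Hypothetical partition of S = {0, ..., 5} into three nonempty blocks.
blocks = {"A1": {0, 1}, "A2": {2, 3, 4}, "A3": {5}}

# mu is positive and finite on every block; nu may take negative values.
mu_blocks = {"A1": Fraction(2), "A2": Fraction(1, 2), "A3": Fraction(3)}
nu_blocks = {"A1": Fraction(1), "A2": Fraction(-1), "A3": Fraction(0)}

def density(x):
    """f(x) = nu(A_i) / mu(A_i) for x in the block A_i."""
    for i, block in blocks.items():
        if x in block:
            return nu_blocks[i] / mu_blocks[i]
    raise ValueError(f"{x} is not in S")

def integral_of_density(J):
    """Integral of f over the union of the blocks indexed by J, with respect to mu."""
    total = Fraction(0)
    for j in J:
        representative = next(iter(blocks[j]))  # f is constant on each block
        total += density(representative) * mu_blocks[j]
    return total
```

For any collection J of block names, integral_of_density(J) returns the sum of nu over those blocks, which is the value of ν on their union.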

Often positive measure spaces that occur in applications can be decomposed into spaces generated by countable partitions.
In the section on Convergence in the chapter on Martingales, we show that more general density functions can be obtained
as limits of density functions of the type in the last theorem.

Probability Spaces
Suppose that (Ω, F , P) is a probability space and that X is a random variable taking values in a measurable space (S, S ).
Recall that the distribution of X is the probability measure P_X on (S, S ) given by

P_X(A) = P(X ∈ A), A ∈ S (3.13.17)

If μ is a positive, σ-finite measure on (S, S ), then the theory of this section applies, of course. The Radon-Nikodym
theorem tells us precisely when (the distribution of) X has a probability density function with respect to μ : we
need the distribution to be absolutely continuous with respect to μ , that is, if μ(A) = 0 then P_X(A) = P(X ∈ A) = 0 for
A ∈ S .

Suppose that r : S → R is measurable, so that r(X) is a real-valued random variable. The integral of r(X) (assuming that
it exists) is of fundamental importance, and is known as the expected value of r(X). We will study expected values in
detail in the next chapter, but here we just note different ways to write the integral. By the change of variables theorem in
the last section we have

∫_Ω r[X(ω)] dP(ω) = ∫_S r(x) dP_X(x) (3.13.18)


Assuming that P_X , the distribution of X, is absolutely continuous with respect to μ , with density function f , we can add to
our chain of integrals using Theorem (14):

∫_Ω r[X(ω)] dP(ω) = ∫_S r(x) dP_X(x) = ∫_S r(x) f(x) dμ(x) (3.13.19)


Specializing, suppose that (S, S , #) is a discrete measure space. Thus X has a discrete distribution and (as noted in the
previous subsection), the distribution of X is absolutely continuous with respect to #, with probability density function f
given by f (x) = P(X = x) for x ∈ S . In this case the integral simplifies:

∫_Ω r[X(ω)] dP(ω) = ∑_{x∈S} r(x) f(x) (3.13.20)
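To see the discrete formula in action, the sketch below computes the expected value of r(X) twice: once as a sum over the sample space and once as a sum against the probability density function of X. The experiment (three fair coin tosses, X the number of heads, r(x) = x^2) is a hypothetical example chosen for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: all outcomes of three fair coin tosses, each with probability 1/8.
omega = list(product("HT", repeat=3))
P = {w: Fraction(1, 8) for w in omega}

def X(w):
    """Number of heads in the outcome w."""
    return w.count("H")

def r(x):
    return x * x

# Integral of r(X) over the sample space with respect to P.
lhs = sum(r(X(w)) * P[w] for w in omega)

# Probability density function of X with respect to counting measure: f(x) = P(X = x).
f = {}
for w in omega:
    f[X(w)] = f.get(X(w), Fraction(0)) + P[w]

# Sum of r(x) f(x) over the values of X.
rhs = sum(r(x) * f[x] for x in f)
```

The two sums agree, which is the content of the displayed identity in this special case.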


Recall next that for n ∈ N , the n-dimensional Euclidean measure space is (R^n, R_n, λ_n) where R_n is the σ-algebra of
Lebesgue measurable sets and λ_n is Lebesgue measure. Suppose now that S ∈ R_n and that S is the σ-algebra of
Lebesgue measurable subsets of S , and that once again, X is a random variable with values in S . By definition, X has a
continuous distribution if P(X = x) = 0 for x ∈ S . But we now know that this is not enough to ensure that the
distribution of X has a density function with respect to λ_n . We need the distribution to be absolutely continuous, so that if
λ_n(A) = 0 then P(X ∈ A) = 0 for A ∈ S . Of course λ_n{x} = 0 for x ∈ S , so absolute continuity implies continuity,
but not conversely. Continuity of the distribution is a (much) weaker condition than absolute continuity of the distribution.
If the distribution of X is continuous but not absolutely so, then the distribution will not have a density function with
respect to λ_n .

For example, suppose that λ_n(S) = 0 . Then the distribution of X and λ_n are mutually singular since P(X ∈ S) = 1 and
so X will not have a density function with respect to λ_n . This will always be the case if S is countable, so that the
distribution of X is discrete. But it is also possible for X to have a continuous distribution on an uncountable set S ∈ R_n
with λ_n(S) = 0 . In such a case, the continuous distribution of X is said to be degenerate. There are a couple of natural
ways in which this can happen that are illustrated in the following exercises.

Suppose that Θ is uniformly distributed on the interval [0, 2π). Let X = cos Θ , Y = sin Θ .

1. (X, Y ) has a continuous distribution on the circle C = {(x, y) : x^2 + y^2 = 1} .
2. The distribution of (X, Y ) and λ_2 are mutually singular.
3. Find P(Y > X) .


The last example is artificial since (X, Y ) has a one-dimensional distribution in a sense, in spite of taking values in R^2 .
And of course Θ has a probability density function f with respect to λ_1 given by f(θ) = 1/(2π) for θ ∈ [0, 2π).

Suppose that X is uniformly distributed on the set {0, 1, 2}, Y is uniformly distributed on the interval [0, 2], and that
X and Y are independent.

1. (X, Y ) has a continuous distribution on the product set S = {0, 1, 2} × [0, 2].
2. The distribution of (X, Y ) and λ_2 are mutually singular.

3. Find P(Y > X) .


The last exercise is artificial since X has a discrete distribution on {0, 1, 2} (with all subsets measurable and with #), and
Y has a continuous distribution on the Euclidean space [0, 2] (with Lebesgue measurable subsets and with λ_1 ). Both are
absolutely continuous; X has density function g given by g(x) = 1/3 for x ∈ {0, 1, 2} and Y has density function h
given by h(y) = 1/2 for y ∈ [0, 2]. So really, the proper measure space on S is the product measure space formed from
these two spaces. Relative to this product space (X, Y ) has a density f given by f(x, y) = 1/6 for (x, y) ∈ S.
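With the product density in hand, a probability such as P(Y > X) can be computed by integrating the constant 1/6 over the relevant part of S with respect to the product of counting measure and Lebesgue measure: a sum over x of the length of {y ∈ [0, 2] : y > x}. The sketch below is one way to carry this out; the helper length_above is a hypothetical name.

```python
from fractions import Fraction

# Constant density of (X, Y) relative to the product of counting and Lebesgue measure.
density = Fraction(1, 6)

def length_above(x, lo=0, hi=2):
    """Lebesgue measure of the set of y in [lo, hi] with y > x."""
    return max(0, hi - max(x, lo))

# Sum over the discrete coordinate, integrate over the continuous coordinate.
prob_y_greater_x = sum(density * length_above(x) for x in (0, 1, 2))
```

The three slices contribute lengths 2, 1, and 0, so the sum is (2 + 1 + 0)/6.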
It is also possible to have a continuous distribution on S ⊆ R^n with λ_n(S) > 0 , yet still with no probability density
function, a much more interesting situation. We will give a classical construction. Let (X_1, X_2, …) be a sequence of
Bernoulli trials with success parameter p ∈ (0, 1). We will indicate the dependence of the probability measure P on the
parameter p with a subscript. Thus, we have a sequence of independent indicator variables with

P_p(X_i = 1) = p, P_p(X_i = 0) = 1 − p (3.13.21)

We interpret X_i as the ith binary digit (bit) of a random variable X taking values in (0, 1). That is, X = ∑_{i=1}^∞ X_i / 2^i .
Conversely, recall that every number x ∈ (0, 1) can be written in binary form as x = ∑_{i=1}^∞ x_i / 2^i where x_i ∈ {0, 1} for
each i ∈ N . This representation is unique except when x is a binary rational of the form x = k / 2^n for n ∈ N and
k ∈ {1, 3, …, 2^n − 1} . In this case, there are two representations, one in which the bits are eventually 0 and one in which
the bits are eventually 1. Note, however, that the set of binary rationals is countable. Finally, note that the uniform
distribution on (0, 1) is the same as Lebesgue measure on (0, 1).

X has a continuous distribution on (0, 1) for every value of the parameter p ∈ (0, 1). Moreover,
1. If p, q ∈ (0, 1) and p ≠ q then the distribution of X with parameter p and the distribution of X with parameter q
are mutually singular.
2. If p = 1/2, X has the uniform distribution on (0, 1).
3. If p ≠ 1/2, then the distribution of X is singular with respect to Lebesgue measure on (0, 1), and hence has no
probability density function in the usual sense.
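The usual route to the singularity claims is the strong law of large numbers: under P_p the fraction of 1 bits among the first n bits converges to p, while under Lebesgue measure (the case p = 1/2) it converges to 1/2. The sketch below gives a finite-n illustration, not a proof. It computes exactly the mass that two different parameters assign to the set of x whose first n bits have a fraction of ones within eps of p. The function name and the choices n = 200, p = 4/5, eps = 1/10 are hypothetical.

```python
from fractions import Fraction
from math import comb

def mass_of_typical_set(n, p, q, eps):
    """P_q(|K/n - p| < eps), where K counts the ones among the first n bits.

    Under P_q the bits are independent Bernoulli(q), so K is binomial(n, q).
    """
    total = Fraction(0)
    for k in range(n + 1):
        if abs(Fraction(k, n) - p) < eps:
            total += comb(n, k) * q**k * (1 - q)**(n - k)
    return total

n = 200
p = Fraction(4, 5)
eps = Fraction(1, 10)

# Mass of the set under the distribution with parameter p.
mass_under_p = mass_of_typical_set(n, p, p, eps)

# Mass of the same set under Lebesgue measure, which corresponds to parameter 1/2.
mass_under_lebesgue = mass_of_typical_set(n, p, Fraction(1, 2), eps)
```

Already at n = 200 the set carries almost all of the mass under parameter p and almost none under Lebesgue measure; letting n grow pushes the two masses toward 1 and 0.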


For an application of some of the ideas in this example, see Bold Play in the game of Red and Black.

The essential uniqueness of density functions can fail if the underlying positive measure μ is not σ-finite. Here is a trivial
counterexample.

Suppose that S is a nonempty set and that S = {S, ∅} is the trivial σ-algebra. Define the positive measure μ on
(S, S ) by μ(∅) = 0 , μ(S) = ∞ . Let ν_c denote the measure on (S, S ) with constant density function c ∈ R with
respect to μ .
1. (S, S , μ) is not σ-finite.
2. ν_c = μ for every c ∈ (0, ∞).
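The arithmetic behind part 2 is just c · ∞ = ∞ for c > 0. A tiny sketch using floating point infinity makes this concrete; the function nu and its boolean argument are hypothetical, and only positive constants c are considered.

```python
import math

def nu(c, is_whole_space):
    """Value of the measure with constant density c on the trivial sigma-algebra.

    The only measurable sets are the empty set and S, with mu(S) = infinity.
    Intended for c > 0 only.
    """
    return c * math.inf if is_whole_space else 0.0

# Different positive constants give the same measure.
values_on_S = [nu(c, True) for c in (0.5, 1.0, 7.0)]
values_on_empty = [nu(c, False) for c in (0.5, 1.0, 7.0)]
```

Every entry of values_on_S is infinite and every entry of values_on_empty is zero, so the constants 0.5, 1, and 7 are all densities of the same measure μ even though they differ everywhere.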

The Radon-Nikodym theorem can fail if the measure μ is not σ-finite, even if ν is finite. Here are a couple of standard
counterexamples.

Suppose that S is an uncountable set and S is the σ-algebra of countable and co-countable sets:
S = {A ⊆ S : A is countable or A^c is countable} (3.13.23)

As usual, let # denote counting measure on S , and define ν on S by ν(A) = 0 if A is countable and ν(A) = 1 if
A^c is countable. Then

1. (S, S , #) is not σ-finite.

2. ν is a finite, positive measure on (S, S ).
3. ν is absolutely continuous with respect to #.
4. ν does not have a density function with respect to #.

Let R denote the standard Borel σ-algebra on R . Let # and λ denote counting measure and Lebesgue measure on
(R, R) , respectively. Then
1. (R, R, #) is not σ-finite.
2. λ is absolutely continuous with respect to #.
3. λ does not have a density function with respect to #.

