
Chapter Six: Information Theory

Adama Science and Technology University

January 10, 2022



Outline

1 Introduction

2 Information sources

3 Source Coding

4 Entropy Coding



Introduction

Performance of a Communication System

Effectiveness (validity)
Reliability

Effectiveness (validity):
Bandwidth
Transmission Rate
Information Transmission Rate R_b (bps): measured in bits per second
Symbol Transmission Rate R_B (Bauds): measured in symbols per second

R_b = R_B \log_2 M, \qquad M = 2^N    (1)

Example:
Transmit 7 symbols in 1 second
{Red, Orange, Yellow, Green, Cyan, Blue, Purple}
R_B = 7 Bauds



Cont. . . .

Using R_b:
{000, 001, 010, 011, 100, 101, 110}
R_b = 7 × 3 = 21 bps

If the information transmission rate of M-ary PSK is 1500 bps and M = 8, determine the symbol transmission rate.
If the symbol transmission rate of QPSK is 3000 Bauds, determine the information transmission rate.

Bandwidth Efficiency
Bit-rate (bandwidth) efficiency: η_b = R_b / B, measured in bps/Hz
Symbol-rate efficiency: η_B = R_B / B, measured in Bauds/Hz
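
As a quick check of Eq. (1), the following sketch (an added illustration, not part of the slides) works out the two rate-conversion questions above:

```python
import math

def symbol_rate(bit_rate_bps: float, M: int) -> float:
    """Symbol rate R_B = R_b / log2(M), from Eq. (1)."""
    return bit_rate_bps / math.log2(M)

def bit_rate(symbol_rate_bauds: float, M: int) -> float:
    """Information rate R_b = R_B * log2(M), from Eq. (1)."""
    return symbol_rate_bauds * math.log2(M)

print(symbol_rate(1500, M=8))   # 8-PSK carrying 1500 bps -> 500 Bauds
print(bit_rate(3000, M=4))      # QPSK (M = 4) at 3000 Bauds -> 6000 bps
```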



Cont. . . .
Reliability
Signal-to-noise ratio (SNR)
Bit/Symbol Error Rate (Probability of Error)
Bit Error Rate (BER): P_eb = error bits / total bits
Symbol Error Rate (SER): P_eB = error symbols / total symbols
Table of Comparison





Information Sources

What is Information?
An information source may be viewed as an object which produces an
event, the outcome of which is selected at random according to a
probability distribution.
A discrete information source is a source which has only a finite set of
symbols as possible outputs.
The set of source symbols is called the source alphabet, and the
elements of the set are called symbols or letters.
Information sources can be classified as having memory or being
memoryless



Cont. . . .

A source with memory is one for which the current symbol depends on


the previous symbols.
A memoryless source is one for which each symbol produced is
independent of the previous symbols.



Cont. . . .

Analog Information Sources


Communication systems are designed to transmit the information
generated by a source to some destination.
The outputs of many sources (e.g., speech, music, or video) are analog
signals, and such sources are therefore called analog sources.
Discrete Information Sources
If the source of information is a digital computer or a storage device such
as magnetic tape or an optical disk, then its output is in discrete form,
such as binary or ASCII.
Sources producing a discrete output are known as discrete sources.



Cont. . . .

Discrete Memoryless Sources (DMS)


Output sequences from the source are statistically independent
(the current output does not depend on any of the past outputs).
A DMS is characterized by the list of symbols, the probability assignment
to these symbols, and the rate at which the source generates these symbols.
For example, a binary source that generates a random sequence such as
101101000..., in which each binary digit is statistically independent of all
the other digits, is a DMS.
Stationary Sources
If the output of a discrete source is statistically dependent on its past or
future outputs, the source has memory; when, in addition, its statistics do
not change with a shift of the time origin, the source is said to be stationary.
A typical example of such a discrete source is one generating English text.



Mathematical Models for the Information Sources
The most important property of an information source is that it
produces an output that is random (which cannot be predicted
exactly).
The simplest type of discrete source is a binary source, which transmits a
binary sequence of the form 11001011....
The transmitted signal consists of only two letters, {0, 1}.
More generally, a discrete information source has an alphabet of m possible
letters, say {x_1, x_2, ..., x_m}, and transmits a sequence of letters
selected from this alphabet.
Each letter x_i in the alphabet has a probability of occurrence P(x_i).
Mathematically, a discrete source is therefore described by

P(x_i) = P(X = x_i), \quad 1 \le i \le m, \qquad \sum_{i=1}^{m} P(x_i) = 1    (2)
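
The model in Eq. (2) can be made concrete with a short simulation; the sketch below (an added illustration, with made-up symbol names and probabilities) draws i.i.d. symbols from a given alphabet and probability assignment:

```python
import random

# Hypothetical alphabet and probability assignment satisfying Eq. (2);
# the symbol names and probabilities are illustrative only.
alphabet = ["x1", "x2", "x3", "x4"]
probs = [0.5, 0.25, 0.125, 0.125]
assert abs(sum(probs) - 1.0) < 1e-12  # the probabilities must sum to 1

# A DMS emits each symbol independently of all previously emitted symbols.
sequence = random.choices(alphabet, weights=probs, k=20)
print(sequence)
```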



Cont. . . .

Logarithmic measure of information


Consider two discrete random variables X and Y such that

X = {x1 , x2 , . . . xn } and Y = {y1 , y2 , . . . ym } (3)

Suppose we observe Y = yj and wish to determine, quantitatively, the


amount of information Y = yj provides about the event X = xi ,
i = 1, 2, . . . n
Note:
If X and Y are statistically independent Y = yj provides no
information about the occurrence of X = xi
If they are fully dependent Y = yj determines the occurrence of X = xi
The information content is simply that provided by X = xi
A suitable measure that satisfies these conditions is the logarithm of the
ratio of the conditional probability P{X = x_i | Y = y_j} = P(x_i|y_j)
to the a priori probability P{X = x_i} = P(x_i).



Cont. . . .
Information content provided by the occurrence of Y = yj about the
event X = xi is defined as
I(x_i, y_j) = \log \frac{P(x_i|y_j)}{P(x_i)}

I (xi , yj ) is called the mutual information between xi and yj


The unit of the information measure is the nat if the natural logarithm
is used and the bit if base 2 is used.
Note: ln a = (ln 2) \log_2 a = 0.69315 \log_2 a
If X and Y are independent, I(x_i, y_j) = 0.
If the occurrence of Y = y_j uniquely determines the occurrence of
X = x_i, then p(x_i|y_j) = 1 and I(x_i, y_j) = \log (1/p(x_i)),
i.e., I(x_i, y_j) = -\log p(x_i) = I(x_i), the self-information of the event X = x_i.
Note: a high-probability event conveys less information than a low-probability event.
Cont. . . .

For a sure event x with p(x) = 1, I(x) = 0: the occurrence of a sure event
does not convey any information.

I(x_i, y_j) = \log \frac{p(x_i|y_j)}{p(x_i)} = \log \frac{p(x_i|y_j) p(y_j)}{p(x_i) p(y_j)} = \log \frac{p(x_i, y_j)}{p(x_i) p(y_j)}

And since

p(y_j|x_i) = \frac{p(x_i, y_j)}{p(x_i)}

I(x_i, y_j) = \log \frac{p(y_j|x_i)}{p(y_j)} = I(y_j, x_i)

The information provided by y_j about x_i is therefore the same as that
provided by x_i about y_j.
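
This symmetry is easy to check numerically; the following sketch (an added illustration with a made-up joint distribution) computes I(x_1, y_1) once from p(x_i|y_j)/p(x_i) and once from p(y_j|x_i)/p(y_j) and obtains the same value:

```python
import math

# Toy joint distribution p(x, y); the numbers are illustrative only.
p_xy = {("x1", "y1"): 0.4, ("x1", "y2"): 0.1,
        ("x2", "y1"): 0.2, ("x2", "y2"): 0.3}

# Marginals p(x) and p(y) obtained by summing the joint distribution.
p_x, p_y = {}, {}
for (x, y), p in p_xy.items():
    p_x[x] = p_x.get(x, 0.0) + p
    p_y[y] = p_y.get(y, 0.0) + p

x, y = "x1", "y1"
info_x_given_y = math.log2((p_xy[(x, y)] / p_y[y]) / p_x[x])  # log2 p(x|y)/p(x)
info_y_given_x = math.log2((p_xy[(x, y)] / p_x[x]) / p_y[y])  # log2 p(y|x)/p(y)
print(info_x_given_y, info_y_given_x)  # both about 0.415 bits: I(x1, y1) = I(y1, x1)
```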



Cont. . . .

Using the same procedure as before, we can also define the conditional
self-information as

I(x_i|y_j) = \log_2 \frac{1}{p(x_i|y_j)} = -\log_2 p(x_i|y_j)

Now consider the following relationship:

I(x_i, y_j) = I(x_i) - I(x_i|y_j)

which follows by writing

I(x_i, y_j) = \log \frac{p(x_i|y_j)}{p(x_i)} = \log p(x_i|y_j) - \log p(x_i) = -I(x_i|y_j) + I(x_i)



Cont. . . .

Properties of I (xi )
The information content of a symbol xi , denoted by I (xi ) satisfies the
following properties
I (xi ) = 0 for P(xi ) = 1
I (xi ) ≥ 0
I (xi ) > I (xj ) if P(xi ) < P(xj )
I (xi , xj ) = I (xi ) + I (xj ) if xi and xj are independent
I (xi ) must satisfy the following requirements:
I(x_i) must approach 0 as P(x_i) approaches 1
The information content I (xi ) must be a non-negative quantity since
each message contains some information. In the worst case, I (xi ) can
be equal to zero
The information content of a message having higher probability of
occurrence is less than the information content of a message having
lower probability



Cont. . . .

Exercise -1:
1. A source produces one of four possible symbols during each interval
with probabilities P(x1) = 0.5, P(x2) = 0.25, P(x3) = P(x4) = 0.125.
Obtain the information content of each of these symbols.
2. Calculate the amount of information if it is given that P(xi) = 0.25.
3. In binary PCM, if '0' occurs with probability 0.25 and '1' occurs with
probability 0.75, calculate the amount of information carried by each digit.
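
The values asked for above follow directly from I(x_i) = -\log_2 P(x_i); a minimal sketch (added for illustration):

```python
import math

def self_information(p: float) -> float:
    """I(x) = -log2 P(x), in bits."""
    return -math.log2(p)

# Item 1: four symbols
for p in (0.5, 0.25, 0.125, 0.125):
    print(f"P = {p}: I = {self_information(p):.3f} bits")  # 1, 2, 3, 3 bits

# Item 3: binary PCM digits
print(self_information(0.25))  # '0' carries 2 bits
print(self_information(0.75))  # '1' carries about 0.415 bits
```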



Cont. . . .
Average Mutual Information
Average mutual information between X and Y is given by:

I(X; Y) = \sum_{i=1}^{n} \sum_{j=1}^{m} p(x_i, y_j) I(x_i, y_j) = \sum_{i=1}^{n} \sum_{j=1}^{m} p(x_i, y_j) \log_2 \frac{p(x_i, y_j)}{p(x_i) p(y_j)} \ge 0    (4)

Similarly, average self-information is given by:

H(X) = \sum_{i=1}^{n} p(x_i) I(x_i) = -\sum_{i=1}^{n} p(x_i) \log_2 p(x_i)    (5)

Where X represents the alphabet of possible output letters from a


source.
H(X ) represents the average self information per source letter and is
called the entropy of the source.
Entropy is defined as the average information per message.
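Eq. (5) translates directly into a short function; the sketch below (added for illustration) also previews the equally likely case discussed on the next slide:

```python
import math
from typing import Sequence

def entropy(probs: Sequence[float]) -> float:
    """H(X) = -sum p_i log2 p_i, in bits per symbol (Eq. (5))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75 bits/symbol
print(entropy([0.25] * 4))                 # 2.0 bits/symbol = log2(4), the uniform case
```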
Cont. . . .

If p(x_i) = 1/n for all i, then the entropy of the source becomes

H(X) = \sum_{i=1}^{n} \frac{1}{n} \log_2 n = \log_2 n

In general, H(X) \le \log_2 n for any given set of source letter probabilities.

The entropy of a discrete source is maximum when the output letters
are all equally probable.
The conditional entropy (average conditional self-information) is defined as follows:

H(X|Y) = \sum_{i=1}^{n} \sum_{j=1}^{m} p(x_i, y_j) \log_2 \frac{1}{p(x_i|y_j)}    (6)



Cont. . . .

Conditional entropy is the information (or uncertainty) in X after Y is
observed.
From the definition of average mutual information one can show that:

I(X; Y) = \sum_{i=1}^{n} \sum_{j=1}^{m} p(x_i, y_j) \left[ \log p(x_i|y_j) - \log p(x_i) \right] = -H(X|Y) + H(X)    (7)

Thus I(X; Y) = H(X) - H(X|Y) and H(X) \ge H(X|Y), with equality
when X and Y are independent.
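
The identity in Eq. (7) can be verified numerically; the sketch below (an added illustration, reusing the toy joint distribution from the earlier sketch) computes H(X), H(X|Y) and I(X;Y) and checks that I(X;Y) = H(X) - H(X|Y):

```python
import math

# Toy joint distribution p(x, y); the numbers are illustrative only.
p_xy = {("x1", "y1"): 0.4, ("x1", "y2"): 0.1,
        ("x2", "y1"): 0.2, ("x2", "y2"): 0.3}

p_x, p_y = {}, {}
for (x, y), p in p_xy.items():
    p_x[x] = p_x.get(x, 0.0) + p
    p_y[y] = p_y.get(y, 0.0) + p

H_X = -sum(p * math.log2(p) for p in p_x.values())
# H(X|Y) = sum p(x,y) log2(1 / p(x|y)), with p(x|y) = p(x,y) / p(y)   (Eq. (6))
H_X_given_Y = sum(p * math.log2(p_y[y] / p) for (x, y), p in p_xy.items())
I_XY = sum(p * math.log2(p / (p_x[x] * p_y[y])) for (x, y), p in p_xy.items())

print(H_X, H_X_given_Y, I_XY)
assert abs(I_XY - (H_X - H_X_given_Y)) < 1e-9   # Eq. (7)
```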



Cont. . . .

H(X|Y) is called the equivocation.

It is interpreted as the amount of average uncertainty remaining in X
after observation of Y.
H(X): average uncertainty prior to observation.
I(X; Y): average information provided about the set X by the
observation of the set Y.
Information Rate
If the time rate at which source X emits symbols is r (symbols/s), the
information rate R (b/s) of the source is:

R = r H(X)    (8)

where R is the information rate (information bits per second),
H(X) is the entropy (average information per symbol),
and r is the rate (symbols per second) at which symbols are generated.



Cont. . . .

Exercise-2:
1. A DMS X has four symbols x1, x2, x3, x4 with probabilities
P(x1) = 0.4, P(x2) = 0.3, P(x3) = 0.2, P(x4) = 0.1. Calculate
a. H(X).
b. Find the amount of information contained in the messages x1 x2 x1 x3
and x4 x3 x3 x2, and compare with the H(X) obtained in part (a).
2. For a source transmitting two independent messages m1 and m2 having
probabilities of p and (1 - p) respectively, prove that the entropy is
maximum when both messages are equally likely. Also, plot the
variation of entropy H as a function of the probability p of one of the
messages.
3. A discrete source emits one of five symbols once every millisecond with
probabilities 1/2, 1/4, 1/8, 1/16 and 1/16, respectively. Determine the
source entropy and information rate.
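
A quick numerical companion to items 2 and 3 (an added sketch, assuming the probabilities listed above):

```python
import math

def entropy(probs):
    """H(X) = -sum p log2 p, in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Item 2: the binary entropy H(p) peaks at p = 0.5
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"p = {p}: H = {entropy([p, 1 - p]):.4f} bits")

# Item 3: five symbols, one emitted every millisecond, so r = 1000 symbols/s
probs = [1/2, 1/4, 1/8, 1/16, 1/16]
H = entropy(probs)     # 1.875 bits/symbol
R = 1000 * H           # information rate R = r * H(X), Eq. (8) -> 1875 bits/s
print(H, R)
```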



Source Coding

An objective of source coding is to minimize the average bit rate


required for representation of the source by reducing the redundancy
of the information source.
A conversion of the output of a discrete memoryless source (DMS)
into a sequence of binary symbols (i.e., binary code word) is called
source coding .
The device that performs this conversion is called the source encoder



Cont. . . .

Terms Related to Source Coding Process


Codeword Length: the number of binary digits in the codeword.
Average Codeword Length (L): the average number of bits per
source symbol in the source coding process,

L = \sum_{i=1}^{m} P(x_i) n_i

where n_i is the length of the codeword assigned to symbol x_i.

Code Efficiency (η):

\eta = \frac{L_{min}}{L}

where L_min is the minimum possible value of L.



Cont. . . .

Code Redundancy (γ):

\gamma = 1 - \eta

The source coding theorem states that, for a DMS X with entropy H(X),
the average codeword length L per symbol is bounded as

L \ge H(X)

and, further, L can be made as close to H(X) as desired for some
suitably chosen code. Thus, with L_min = H(X), the code efficiency
can be written as

Code Efficiency = \frac{H(X)}{L} \times 100\%    (9)
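
As an illustration of Eq. (9), the following sketch (with made-up codeword lengths, not a code from the slides) computes L, H(X), the efficiency and the redundancy:

```python
import math

def code_efficiency(probs, lengths):
    """Return (L, H, efficiency, redundancy) for a given code."""
    L = sum(p * n for p, n in zip(probs, lengths))       # average codeword length
    H = -sum(p * math.log2(p) for p in probs if p > 0)   # source entropy, bits/symbol
    eta = H / L
    return L, H, eta, 1 - eta

# Hypothetical example: four symbols and a variable-length code of lengths 1, 2, 3, 3
probs = [0.5, 0.25, 0.125, 0.125]
lengths = [1, 2, 3, 3]
print(code_efficiency(probs, lengths))  # L = 1.75 = H(X), so the efficiency is 100 %
```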



Cont. . . .
Classification of Codes
Fixed-Length Codes: A fixed-length code is one whose codeword length is fixed.
Code 1 and code 2 of the table below are fixed-length codes with length 2.
Variable-Length Codes: A variable-length code is one whose codeword length is not fixed.
All codes of the table below except codes 1 and 2 are variable-length codes.
Distinct Codes: A code is distinct if each codeword is distinguishable from the other codewords.
All codes of the table below except code 1 are distinct codes; notice the codes for x1 and x3.
Prefix-Free Codes: A code in which no codeword can be formed by adding code symbols
to another codeword (equivalently, no codeword is a prefix of another codeword) is
called a prefix-free code.
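
The prefix-free property can be checked mechanically; a minimal sketch (the example codes are mine, not the codes of the table referred to above):

```python
def is_prefix_free(codewords):
    """True if no codeword is a prefix of another codeword."""
    for a in codewords:
        for b in codewords:
            if a != b and b.startswith(a):
                return False
    return True

print(is_prefix_free(["0", "10", "110", "111"]))   # True: prefix-free
print(is_prefix_free(["0", "01", "011", "0111"]))  # False: "0" is a prefix of "01"
```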
Cont. . . .

Exercise -4: Given a DMS X with two symbols x1 and x2 and


P(x1 ) = 0.9, P(x2 ) = 0.1. Symbols x1 and x2 are encoded as follows:

Find the efficiency and the redundancy of this code.



Entropy Coding

The design of a variable-length code such that its average codeword
length approaches the entropy of the DMS is often referred to as entropy
coding.
In this section, we present two examples of entropy coding:
Shannon-Fano coding and
Huffman coding



Shannon-Fano Coding algorithm
Shannon showed: "To reliably store the information generated by some
random source X, you need, on average, no more and no less than
H(X) bits for each outcome."
Using an ASCII representation, a computer needs 8 bits = 1 byte to store
each outcome; a file of 1,000,000 outcomes would then have a size of 8,000,000 bits.
Shannon-Fano Coding algorithm (a sketch in Python follows this list):
1 List the source symbols in order of decreasing probability.
2 Partition the set into two sets that are as close to equiprobable as
possible, and assign 0 to the upper set and 1 to the lower set.
3 Continue this process, each time partitioning the sets with as nearly
equal probabilities as possible, until further partitioning is not possible.
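
A minimal Python sketch of this partitioning procedure (an added illustration; the recursive split mirrors steps 1-3 above):

```python
from typing import Dict, List, Tuple

def shannon_fano(symbols: List[Tuple[str, float]]) -> Dict[str, str]:
    """Assign Shannon-Fano codewords to (symbol, probability) pairs."""
    # Step 1: list the symbols in order of decreasing probability.
    items = sorted(symbols, key=lambda sp: sp[1], reverse=True)
    codes = {sym: "" for sym, _ in items}

    def split(group: List[Tuple[str, float]]) -> None:
        if len(group) < 2:
            return
        total = sum(p for _, p in group)
        # Step 2: find the split that makes the two subsets as close to
        # equiprobable as possible.
        running, best_k, best_diff = 0.0, 1, float("inf")
        for k in range(1, len(group)):
            running += group[k - 1][1]
            diff = abs(running - (total - running))
            if diff < best_diff:
                best_diff, best_k = diff, k
        upper, lower = group[:best_k], group[best_k:]
        for sym, _ in upper:
            codes[sym] += "0"   # 0 to the upper set
        for sym, _ in lower:
            codes[sym] += "1"   # 1 to the lower set
        # Step 3: keep partitioning until no further split is possible.
        split(upper)
        split(lower)

    split(items)
    return codes

# Usage: five equally likely symbols (cf. Exercise-5 at the end of the chapter)
print(shannon_fano([(f"x{i}", 0.2) for i in range(1, 6)]))
```

For the five equiprobable symbols this particular split gives codeword lengths {2, 2, 2, 3, 3}, i.e. an average length of 2.4 bits/symbol against an entropy of log_2 5, roughly 2.32 bits/symbol.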



Huffman Coding Algorithm

1 List the source symbols in order of decreasing probability.


2 Combine the probabilities of the two symbols having the lowest
probabilities, and reorder the resultant probabilities; this step is called
reduction 1. The same procedure is repeated until there are two
ordered probabilities remaining.
3 Start encoding with the last reduction, which consists of exactly two
ordered probabilities. Assign 0 as the first digit in the codewords for all
the source symbols associated with the first probability; assign 1 to the
second probability.
4 Now go back and assign 0 and 1 to the second digit for the two
probabilities that were combined in the previous reduction step,
retaining all assignments made in Step 3.
5 Keep regressing this way until the first column is reached.



Cont. . . .

Example: Given a DMS with seven source letters x1 , x2 , . . . , x7 with


probabilities 0.35, 0.30, 0.20, 0.10, 0.04, 0.005, 0.005, respectively.
Order the symbols in decreasing order of probabilities.
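
The reduction table for this example is not reproduced in this extract; as a stand-in, the following sketch (an added illustration using Python's heapq) builds a Huffman code for these probabilities and compares the average codeword length with the source entropy:

```python
import heapq
import math
from itertools import count

def huffman(prob_map):
    """Return a dict mapping each symbol to its Huffman codeword."""
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), {s: ""}) for s, p in prob_map.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)   # the two least probable groups
        p1, _, c1 = heapq.heappop(heap)
        for s in c0:                      # extend their codewords from the front
            c0[s] = "0" + c0[s]
        for s in c1:
            c1[s] = "1" + c1[s]
        heapq.heappush(heap, (p0 + p1, next(tiebreak), {**c0, **c1}))
    return heap[0][2]

probs = {"x1": 0.35, "x2": 0.30, "x3": 0.20, "x4": 0.10,
         "x5": 0.04, "x6": 0.005, "x7": 0.005}
code = huffman(probs)
L = sum(probs[s] * len(code[s]) for s in probs)      # average codeword length
H = -sum(p * math.log2(p) for p in probs.values())   # source entropy
print(code)
print(f"L = {L:.3f} bits/symbol, H(X) = {H:.3f} bits/symbol")
```

Tie-breaking may yield different but equally valid codeword assignments; the average codeword length is the same and, as the source coding theorem requires, it never falls below H(X).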





Cont. . . .

Exercise-5:
A Discrete Memoryless Source X has five equally likely symbols.
a. Construct a Shannon-Fano code for X, and calculate the efficiency of
the code.
b. Construct another Shannon-Fano code and compare the results.
c. Repeat for the Huffman code and compare the results.



Thank You
