DSP PDF
1. Introduction
Data compression is a crucial aspect of modern computing, enabling efficient storage and
transmission of information. Huffman coding is a widely used technique for lossless data
compression. In this report, we explore Huffman data compression, including its algorithm,
implementation, and experimental results.
2. Problem Description
The goal of Huffman coding is to efficiently represent data by assigning variable-length codes
to input symbols based on their frequencies. Symbols with higher frequencies are assigned
shorter codes, while symbols with lower frequencies are assigned longer codes, minimizing the
average code length.
3. Algorithm (Pseudocode)
1. Read the number of symbols and their corresponding probabilities.
2. Create a priority queue and insert the symbols with their probabilities.
3. Repeat the following steps until the priority queue contains only one element:
   a. Remove the two elements with the lowest probabilities from the priority queue.
   b. Create a new node with the sum of the two probabilities and the concatenation of their symbols.
   c. Insert the new node back into the priority queue.
4. The remaining element in the priority queue is the root of the Huffman tree.
This pseudocode focuses on the core logic of Huffman coding: reading the symbols and
probabilities supplied by the user, building the Huffman tree, and deriving from it the code
dictionary used to encode a symbol sequence.
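The tree-building steps of the pseudocode can be sketched as follows. This is an illustrative Python version (the report's actual implementation is in MATLAB), and the three-symbol distribution at the bottom is an assumed example, not the report's data.

```python
# Illustrative sketch of the pseudocode: merge the two least-probable nodes
# until one tree remains, then read codewords off the tree's branches.
import heapq
from itertools import count

def huffman_codes(probabilities):
    """Build a Huffman tree from {symbol: probability} and return codewords."""
    tie = count()  # tie-breaker so the heap never has to compare tree nodes
    heap = [(p, next(tie), sym) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Remove the two lowest-probability elements and merge them.
        p1, _, left = heapq.heappop(heap)
        p2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (p1 + p2, next(tie), (left, right)))
    _, _, root = heap[0]  # the remaining element is the root of the tree
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):   # internal node: branch with 0 and 1
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                         # leaf: record the finished codeword
            codes[node] = prefix or "0"
    walk(root, "")
    return codes

codes = huffman_codes({"a": 0.5, "b": 0.25, "c": 0.25})
print(codes)  # {'a': '0', 'b': '10', 'c': '11'}
```

Note how the most probable symbol ends up with the shortest codeword, exactly as the problem description requires.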
[Flowchart: Start → (3) sort probabilities in descending order → (4) generate Huffman
dictionary → (5) generate random input signal → … → (10) calculate sequence length →
(11) calculate encoded length → (12) display results → End]
4. MATLAB Implementation
x = input('enter the number of symbol');   % number of distinct symbols
n = 1:x;                                   % symbols are numbered 1..x
disp('the number of symbols are n:')
disp(n)
p = input('enter the probabilities: ');    % probability of each symbol
disp(p)
s = sort(p,'descend');                     % sorted view of the distribution
disp(s)
dict = huffmandict(n,p);                   % Huffman code dictionary
bps = ceil(log2(max(n)));                  % bits per symbol, fixed-length
inputsig = randsrc(100,1,[n;p]);           % random sequence drawn from p
code = huffmanenco(inputsig,dict);         % encode with the dictionary
sig = huffmandeco(code,dict);              % decode back
isequal(inputsig,sig)                      % verify lossless round trip
binarySig = int2bit(inputsig,bps);         % fixed-length binary form
seqLen = numel(binarySig)                  % bits before compression
binaryComp = int2bit(code,1);              % encoded elements are already bits
encodedLen = numel(binaryComp)             % bits after compression
Experimental Results: After applying Huffman coding, the average code length is reduced,
because the algorithm assigns shorter codes to more probable symbols and longer codes to less
probable symbols.
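This reduction can be checked with a small worked example. The four-symbol distribution and codeword lengths below are hypothetical, chosen only to illustrate the average-length calculation.

```python
# Average Huffman code length vs. the fixed-length alternative, for an
# assumed 4-symbol distribution (not the report's experimental data).
import math

probs = [0.5, 0.25, 0.125, 0.125]   # assumed symbol probabilities
lengths = [1, 2, 3, 3]              # codeword lengths Huffman assigns here
avg_len = sum(p * l for p, l in zip(probs, lengths))
entropy = -sum(p * math.log2(p) for p in probs)
print(avg_len)   # 1.75 bits/symbol vs. 2 bits for a fixed-length code
print(entropy)   # 1.75 bits: Huffman meets the entropy for these dyadic probs
```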
5. Theory
1. x = input('enter the number of symbol');
This function prompts the user to enter the number of symbols in your data. It evaluates the
user's input as a number and stores it in the variable x.
It's mandatory because you need to define the number of symbols you're working with to create
the Huffman code dictionary.
2. n = 1:x;
This creates a row vector n containing sequential integers from 1 to the value stored in x.
It's typically used to represent the symbols in your data. Here, it assumes your symbols are
simply numbered from 1 to x.
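For readers unfamiliar with MATLAB's colon operator, a Python equivalent of this line (with x = 4 as an assumed value) is:

```python
# MATLAB's n = 1:x produces the row vector [1 2 ... x]; Python equivalent:
x = 4                        # assumed number of symbols for illustration
n = list(range(1, x + 1))    # symbols numbered 1 through x
print(n)  # [1, 2, 3, 4]
```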
3. disp('the number of symbols are n:'); disp(n)
These lines use the disp function to display informative messages to the user.
The first line displays "the number of symbols are n:" on the console.
The second line displays the contents of the vector n, showing the user the defined symbols.
These are not mandatory for functionality but improve clarity and user interaction.
4. p = input('enter the probabilities: ');
Similar to the input call for x, this prompts the user to enter the probabilities associated with
each symbol. It stores the user's input as a row vector in p.
The values in p should sum to 1 and represent the likelihood of each symbol appearing in your
data.
This is mandatory because Huffman coding uses probabilities to create optimal codewords.
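Because the codewords are derived from p, a malformed distribution silently degrades the result. A quick sanity check (with assumed values) looks like:

```python
# Verify that an entered probability vector is a valid distribution
# (the values of p here are assumed for illustration).
p = [0.4, 0.3, 0.2, 0.1]
assert abs(sum(p) - 1.0) < 1e-9, "probabilities must sum to 1"
print("valid distribution")
```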
5. s = sort(p,'descend');
sort(p,'descend') sorts the elements in p in descending order and stores the result in s.
The disp functions again display messages and the sorted probabilities for user reference.
Sorting probabilities can be helpful for visualization and understanding the distribution of symbol
occurrences, but it's not strictly necessary for Huffman encoding itself.
6. dict = huffmandict(n, p);
This builds the Huffman code dictionary dict, pairing each symbol in n with a variable-length
binary codeword derived from its probability in p.
It is mandatory: the dictionary is what the encoding and decoding functions use.
7. bps = ceil(log2(max(n)));
This calculates the minimum number of bits required to represent the symbols in binary
form using ceil(log2(max(n))).
log2 calculates the base-2 logarithm, and ceil rounds the result up to the nearest integer.
It's informative to know the minimum bits needed, but not crucial for encoding/decoding.
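As a small numeric check of this formula, take an assumed x = 5 symbols:

```python
# bps = ceil(log2(max(n))): bits needed for a fixed-length label of each
# of the symbols 1..x (x = 5 is an assumed value for illustration).
import math
x = 5
bps = math.ceil(math.log2(x))
print(bps)  # 3: two bits can label only 4 symbols, so 5 symbols need 3
```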
8. code = huffmanenco(inputsig, dict);
This encodes the random symbol sequence (inputsig) using the Huffman code dictionary
(dict).
huffmanenco is another MATLAB function that performs the encoding based on the
codewords in dict.
The internal process (not directly accessible) involves replacing each symbol in inputsig with
its corresponding Huffman codeword from dict.
The output, code, is a numeric vector containing the encoded data as a sequence of 0s and 1s.
This is mandatory for encoding data using Huffman coding.
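The internal replacement step described above can be sketched in a few lines. The dictionary and input sequence here are hypothetical stand-ins for MATLAB's dict and inputsig:

```python
# Sketch of the table lookup huffmanenco performs: replace each symbol
# with its codeword and concatenate (dictionary is hypothetical).
dict_codes = {1: "0", 2: "10", 3: "11"}           # symbol -> codeword
inputsig = [1, 3, 2, 1, 1]                        # assumed symbol sequence
code = "".join(dict_codes[s] for s in inputsig)   # concatenated bitstream

# Decoding relies on the prefix property: no codeword is a prefix of
# another, so bits can be consumed until a codeword matches, unambiguously.
rev = {v: k for k, v in dict_codes.items()}
decoded, buf = [], ""
for bit in code:
    buf += bit
    if buf in rev:
        decoded.append(rev[buf])
        buf = ""
print(code)                 # 0111000
print(decoded == inputsig)  # True
```

The round-trip check mirrors the isequal(inputsig, sig) test in the MATLAB code.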
EXAMPLE : 1
EXAMPLE : 2
6. Conclusion
Huffman coding is a powerful technique for lossless data compression. This report presented an
overview of Huffman coding, its algorithm, implementation, and experimental results. Huffman coding
efficiently reduces the size of data by assigning shorter codes to frequently occurring symbols, thus
lowering the average number of bits needed per symbol.