Data_structure_dessalew
Instructor: Dessalew G.
Haramaya University
Chapter One: Introduction
This Chapter Covers:
Primitive and Non-primitive Data structures
Algorithm Analysis:
Computational and asymptotic complexity
Best case, Worst Case, Average Case Analysis
B. sum = 0;
   for (i = 1; i <= n; i++)
       for (j = 1; j <= i; j++)
           sum++;
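Here the statement sum++ executes 1 + 2 + … + n = n(n+1)/2 times, so T(n) grows in proportion to n^2, i.e. T(n) = O(n^2).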
Algorithm Analysis Categories
• Algorithms must be examined under different situations to correctly determine their efficiency and to allow accurate comparisons.
• Best Case Analysis: Assumes that data are arranged in
the most advantageous order. It also assumes the
minimum input size.
• E.g.
• For sorting – the best case is if the data are arranged in the
required order.
• For searching – the required item is found at the first position.
• Note: the best case computes the lower bound of T(n); it results in the fewest number of executions.
Worst case analysis
• Assumes that data are arranged in the most disadvantageous order.
• It also assumes the maximum input size.
• E.g.
• For sorting – the data are arranged in the opposite of the required order.
• For searching – the required item is found at the last position, or is missing.
• It computes the upper bound of T(n) and causes the maximum number of executions.
Average case Analysis
Assumes that data are found in random order.
It also assumes random or average input size.
E.g.
For sorting – data are in random order.
For searching – the required item is found at any position
or missing.
It computes the optimal bound of T(n).
It also causes an average number of executions.
Best case and average case analysis cannot, by themselves, be used to determine the complexity of algorithms.
Worst case analysis is the best way to determine the complexity of algorithms.
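As an illustration, a minimal sequential search sketch (the function name, the array list, its size n and the key are assumptions here) shows where the three cases come from:

int sequentialSearch(int list[], int n, int key)
{
    for (int i = 0; i < n; i++)      // best case: key is at index 0 (one comparison)
        if (list[i] == key)
            return i;                // average case: key found near the middle (about n/2 comparisons)
    return -1;                       // worst case: key is the last element or missing (n comparisons)
}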
Order Of Magnitude
Order of magnitude refers to the rate at which the storage or time grows as a function of the problem size n.
It is expressed in terms of its relationship to some known functions – this is asymptotic analysis.
Asymptotic complexity of an algorithm is an
approximation of the computational complexity that holds
for large amounts of input data.
Finding Asymptotic Complexity:
Examples
Rules to find Big-O from a given T(n)
Take the highest-order term
Ignore its coefficient
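For example, if T(n) = 3n^2 + 4n + 2, the highest-order term is 3n^2, and ignoring its coefficient gives T(n) = O(n^2).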
Reading Assignment
Big-Ω (Omega) and Big-Θ (Theta) notations
Amortized complexity
Finding the Big-O of a Given Algorithm
1. for (i = 1; i <= n; i++)
       cout << i;
T(n) = 1 + (n+1) + n + n
     = 3n + 2
T(n) = O(n)
…
2. for (i = 1; i <= n; i++)
       for (j = 1; j <= n; j++)
           cout << i;
T(n) = 1 + (n+1) + n + n(1 + (n+1) + n + n)
     = 3n^2 + 4n + 2
T(n) = O(n^2)
Exercise
Find the Big-O of the following algorithms:
1. for (i = 1; i <= n; i++)
       sum = sum + i;
   for (i = 1; i <= n; i++)
       for (j = 1; j <= m; j++)
           sum++;
…
2. if (k == 1)
   {
       for (i = 1; i <= 100; i++)
           for (j = 1; j <= 1000; j++)
               cout << i;
   }
   else if (k == 2)
   {
       for (i = 1; i <= n; i = i * 2)
           cout << i;
   }
   else
   {
       for (i = 1; i <= n; i++)
           sum++;
   }
…
3. for (i = 1; i <= n; i++)
   {
       if (i < 10)
           for (j = 1; j <= n; j = j * 2)
               cout << j;
       else
           cout << i;
   }
…
4. for (int i = 1; i <= N; ++i)
       for (int j = i; j > 0; --j)
           {}   // do nothing
5. for (int i = 1; i < N; i = i * 2)
       {}
6. for (int i = 0; i < N; ++i)
       for (int j = i; j > 0; j = j / 2)
           {}
…
Sequential search example:
List index: 0  1  2  3  4  5  6  7
Data:       C  A  E  F  N  Q  R  K
Key = Q: after six comparisons the algorithm returns "found".
Key = X: after eight comparisons the algorithm returns "not found".
Simple Sort Algorithm
The first element is compared with each of the other elements; if any of these other elements is less than the current first element, then the first element is swapped with that element.
The above step is repeated with the second, third and all other subsequent elements.
Implementation:
void swap(dataType &x, dataType &y)   // parameters are passed by reference so the exchange is visible to the caller
{
    dataType temp;
    temp = x;
    x = y;
    y = temp;
}
for (i = 0; i <= n-2; i++)
    for (j = i+1; j <= n-1; j++)
        if (dataElement[i] > dataElement[j])
            swap(dataElement[i], dataElement[j]);
Example
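A minimal, self-contained version of the simple sort above, using an assumed sample array, is sketched below:

#include <iostream>
using namespace std;
typedef int dataType;                 // assumed element type for this sketch

void swap(dataType &x, dataType &y)   // pass by reference so the caller's elements are exchanged
{
    dataType temp = x;
    x = y;
    y = temp;
}

int main()
{
    dataType dataElement[] = {5, 2, 3, 8, 1};   // assumed sample data
    int n = 5;
    for (int i = 0; i <= n - 2; i++)            // fix element i, then compare it with every later element
        for (int j = i + 1; j <= n - 1; j++)
            if (dataElement[i] > dataElement[j])
                swap(dataElement[i], dataElement[j]);
    for (int i = 0; i < n; i++)
        cout << dataElement[i] << " ";          // prints: 1 2 3 5 8
    return 0;
}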
Bubble Sort
Implementation:
for (int i = 0; i < n-1; i++)         // after pass i, data[0..i] holds the i+1 smallest elements
    for (int j = n-1; j > i; --j)     // walk from the end of the array towards position i
        if (data[j] < data[j-1])
            swap(data[j], data[j-1]); // swap adjacent elements that are out of order
Example: data = 5, 2, 3, 8, 1
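Tracing the implementation above on this data, each pass bubbles the smallest remaining element towards the front:
Pass 1: 1 5 2 3 8
Pass 2: 1 2 5 3 8
Pass 3: 1 2 3 5 8
Pass 4: 1 2 3 5 8 (already in order, no swaps)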
Analysis
Pass      Number of comparisons     Number of swaps
1         n-1                       n-1
2         n-2                       n-2
...       ...                       ...
n-1       1                         1
Total     n(n-1)/2                  n(n-1)/2
T(n) = O(n^2)
Selection Sort Algorithm
It is in many ways similar to both the simple and bubble sort algorithms.
Rather than continuously swapping neighbours as the bubble sort does while traversing the subarray to be sorted, this algorithm finds the minimum element of the subarray and swaps it with the pivot element.
Implementation
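A minimal selection sort sketch is given below; it assumes the same dataElement array, size n and swap function used for the simple sort above:

for (int i = 0; i <= n - 2; i++)
{
    int minIndex = i;                                // position of the pivot element for this pass
    for (int j = i + 1; j <= n - 1; j++)
        if (dataElement[j] < dataElement[minIndex])
            minIndex = j;                            // remember the smallest element seen so far
    if (minIndex != i)
        swap(dataElement[i], dataElement[minIndex]); // at most one swap per pass
}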
Shell Sort
The shell sort, also known as the diminishing increment
sort, was developed by Donald L. Shell in 1959.
The idea behind shell sort is that it is faster to sort an
array if parts of it are already sorted.
The original array is first divided into a number of smaller subarrays; these subarrays are sorted, then they are combined into the overall array, and this is sorted.
For example, an array might be sorted using three iterations: a 5-sort, a 3-sort and a 1-sort.
This sequence is known as the diminishing increment.
But how do we decide on this sequence of increments?
Unfortunately, there is no definitive answer to this
question. In the original algorithm proposed by
Donald L. Shell, powers of 2 were used for the
increments, e.g. 16, 8, 4, 2, 1.
However, this is not the most efficient technique.
Experimental studies have shown that increments
calculated according to the following conditions lead
to better efficiency:
h1 = 1
h(i+1) = 3·h(i) + 1
For example, for a list of length 10,000 the sequence of
increments would be 3280, 1093, 364, 121, 40, 13, 4, 1.
Another decision that must be made with shell sort is which sorting algorithm should be used to sort the subarrays at each iteration.
A number of different choices have been made: for example, one technique is to use insertion sort for every iteration except the last, and bubble sort for the last iteration.
Actually, whatever simple sorting algorithms are used
for the different iterations, shell sort performs better
than the simple sorting algorithms on their own.
It may seem that shell sort should be less efficient,
since it performs a number of different sorts, but
remember that most of these sorts are on small arrays,
and in most cases the data is already almost sorted.
An experimental analysis has shown that the complexity of shell sort is approximately O(n^1.25), which is better than the O(n^2) offered by the simple algorithms.
Example 2:
Sort (5, 8, 2, 4, 1, 3, 9, 7, 6, 0)
Solution:
4-sort:
  sort (5, 1, 6) → 1, 8, 2, 4, 5, 3, 9, 7, 6, 0
  sort (8, 3, 0) → 1, 0, 2, 4, 5, 3, 9, 7, 6, 8
  sort (2, 9)    → 1, 0, 2, 4, 5, 3, 9, 7, 6, 8
  sort (4, 7)    → 1, 0, 2, 4, 5, 3, 9, 7, 6, 8
1-sort:
  sort (1, 0, 2, 4, 5, 3, 9, 7, 6, 8) → 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
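One way to implement shell sort, using the 3h+1 increment sequence described above and an insertion sort on each h-spaced subarray, is sketched below (the function name shellSort and the vector-based interface are assumptions of this sketch):

#include <vector>
using namespace std;

void shellSort(vector<int> &data)
{
    int n = data.size();
    int h = 1;
    while (3 * h + 1 < n)              // find a large increment of the form 3h+1 below n
        h = 3 * h + 1;
    for (; h >= 1; h = (h - 1) / 3)    // diminishing increments: ..., 13, 4, 1
    {
        for (int i = h; i < n; i++)    // insertion sort on every h-th element
        {
            int value = data[i];
            int j = i;
            while (j >= h && data[j - h] > value)
            {
                data[j] = data[j - h]; // shift larger elements h positions to the right
                j -= h;
            }
            data[j] = value;
        }
    }
}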
Heap Sort
A heap is a binary tree that has two properties:
• The value of each node is greater than or equal to the values stored in each of its children.
• The tree is perfectly balanced, and the leaves in the last level are all in the leftmost positions (the tree is filled from left to right).
A property of the heap data structure is that the largest element is always at the root of the tree.
A common way of implementing a heap is to use an array: the root is stored at index 0, and the children of the node at index i are stored at indices 2i+1 and 2i+2.
Heap Sort…
The heap sort works by first rearranging the input array so that it is a heap, and then removing the largest elements from the heap one by one. Pseudocode for the heap sort is given below:
HeapSort(data):
    Transform data into a heap
    for (i = n-1; i > 0; i--)
        Swap the root with the element in position i
        Restore the heap property for the tree data[0] … data[i-1]
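A possible C++ version of this pseudocode is sketched below; the siftDown helper name and the vector-based interface are assumptions of the sketch:

#include <vector>
#include <utility>
using namespace std;

// Sift the element at index root down until the subtree rooted there is a max-heap again;
// size is the number of elements currently in the heap.
void siftDown(vector<int> &data, int root, int size)
{
    while (2 * root + 1 < size)                    // while the node has at least one child
    {
        int child = 2 * root + 1;                  // left child
        if (child + 1 < size && data[child + 1] > data[child])
            child++;                               // pick the larger of the two children
        if (data[root] >= data[child])
            break;                                 // heap property already holds
        swap(data[root], data[child]);
        root = child;
    }
}

void heapSort(vector<int> &data)
{
    int n = data.size();
    for (int i = n / 2 - 1; i >= 0; i--)           // phase 1: transform data into a heap
        siftDown(data, i, n);
    for (int i = n - 1; i > 0; i--)                // phase 2: move the largest element to the end
    {
        swap(data[0], data[i]);                    // swap the root with the element in position i
        siftDown(data, 0, i);                      // restore the heap property for data[0..i-1]
    }
}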
Example
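A small hand trace on an assumed input (4, 10, 3, 5, 1):
Build the max-heap:                 10, 5, 3, 4, 1
Swap root with last, restore heap:  5, 4, 3, 1 | 10
Repeat:                             4, 1, 3 | 5, 10
                                    3, 1 | 4, 5, 10
                                    1 | 3, 4, 5, 10
Sorted result:                      1, 3, 4, 5, 10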
Quick Sort
Quicksort was developed by C. A. R. Hoare in 1961, and is
probably the most famous and widely used of sorting
algorithms.
The guiding principle behind the algorithm is similar to
that of shell sort: it is more efficient to sort a number of
smaller subarrays than to sort one big array.
The Algorithm
In quicksort the original array is first divided into two
subarrays, the first of which contains only elements that
are less than or equal to a chosen element, called the
bound or pivot.
The second subarray contains elements that are greater
than or equal to the bound. If each of these subarrays is
sorted separately they can be combined into a final
sorted array.
To sort each of the subarrays, they are both subdivided
again using two new bounds, making a total of 4
subarrays.
The partitioning is repeated again for each of these subarrays, and so on until the subarrays consist of a single element each and do not need to be sorted.
…
Quicksort is inherently recursive in nature, since it consists
of the same simple operation (the partitioning) applied to
successively smaller subarrays.
Pseudocode for quicksort is given below:
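One possible quicksort sketch, choosing the middle element as the bound, is given below (the vector-based interface is an assumption of this sketch):

#include <vector>
#include <utility>
using namespace std;

void quickSort(vector<int> &data, int low, int high)
{
    if (low >= high)
        return;                              // zero or one element: already sorted
    int bound = data[(low + high) / 2];      // choose the middle element as the bound (pivot)
    int i = low, j = high;
    while (i <= j)                           // partition: <= bound to the left, >= bound to the right
    {
        while (data[i] < bound) i++;
        while (data[j] > bound) j--;
        if (i <= j)
            swap(data[i++], data[j--]);
    }
    quickSort(data, low, j);                 // sort the left subarray
    quickSort(data, i, high);                // sort the right subarray
}
// initial call: quickSort(data, 0, data.size() - 1);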
…
The pivot element can be selected in many ways: we may select the first element as the pivot, the middle element of the array, or a randomly selected pivot (bound) element. The following example demonstrates a pivot element selected from the middle of the data. On average, quicksort has a complexity of O(n log n).
Example [ pivot selected from the middle]
In the above Example
To begin with, the middle element 6 is chosen as the
bound, and the array partitioned into two subarrays.
Each of these subarrays is then partitioned using the
bounds 2 and 10 respectively.
In the third phase, two of the four subarrays only have
one element, so they are not partitioned further.
The other two subarrays are partitioned once more,
using the bounds 3 and 7.
Finally we are left with only subarrays consisting of a single element. At this point, the subarrays and bounds are recombined successively, resulting in the sorted array.
Merge Sort
Mergesort also works by successively partitioning the
array into two subarrays, but it guarantees that the
subarrays are of approximately equal size.
This is possible because in mergesort the array is partitioned without regard to the values of its elements: it is simply divided down the middle into two halves.
Each of these halves is recursively sorted using the
same algorithm.
After the two subarrays have been sorted, they are
merged back together again.
The recursion continues until the subarrays consist of
a single element, in which case they are already sorted.
Pseudocode for merge sort:
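A possible merge sort sketch corresponding to this description (the merge helper and the vector-based interface are assumptions of the sketch):

#include <vector>
using namespace std;

// Merge the two sorted halves data[low..mid] and data[mid+1..high].
void merge(vector<int> &data, int low, int mid, int high)
{
    vector<int> temp;
    int i = low, j = mid + 1;
    while (i <= mid && j <= high)                    // take the smaller front element of the two halves
        temp.push_back(data[i] <= data[j] ? data[i++] : data[j++]);
    while (i <= mid)  temp.push_back(data[i++]);     // copy any leftovers from the left half
    while (j <= high) temp.push_back(data[j++]);     // copy any leftovers from the right half
    for (int k = 0; k < (int)temp.size(); k++)
        data[low + k] = temp[k];                     // copy the merged result back
}

void mergeSort(vector<int> &data, int low, int high)
{
    if (low >= high)
        return;                        // a single element is already sorted
    int mid = (low + high) / 2;        // divide the array down the middle
    mergeSort(data, low, mid);         // recursively sort the left half
    mergeSort(data, mid + 1, high);    // recursively sort the right half
    merge(data, low, mid, high);       // merge the two sorted halves
}
// initial call: mergeSort(data, 0, data.size() - 1);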
Example
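A small illustration on an assumed input (5, 2, 4, 6, 1, 3):
Split down the middle:      (5, 2, 4) and (6, 1, 3)
Sort each half recursively: (2, 4, 5) and (1, 3, 6)
Merge the sorted halves:    1, 2, 3, 4, 5, 6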