DSA Chap 1 PDF
DSA Chap 1 PDF
DSA Chap 1 PDF
Data Structure :
A data structure is a particular way of organizing data in a computer so that it can be
used efficiently. It is a way of collecting and organizing data in such a way that we can
perform operations on that data in an effective way.
For example, we can store a list of items having the same data-type using the array data
structure.
It represents the knowledge of data to be organized in memory. It should be designed &
implemented in such a way that it reduces the complexity and increases the efficiency.
Data structures are often classified by their characteristics. Some of those characteristics
are:
Linear or non-linear: This characteristic describes whether the data items are
arranged in chronological sequence, such as with an array, or in an unordered
sequence, such as with a graph.
Static or dynamic: This characteristic describes how the data structures are
compiled. Static data structures have fixed sizes, structures and memory locations
1
at compile time. Dynamic data structures have sizes, structures and memory
locations that can shrink or expand depending on the use.
Basic Terminologies :
Explaining or defining the basic terms used in data structures is called basic terminology.
The basic terms are described in brief as below :
1) Data : Data can be defined as value or collection of values, facts or figures. Data is
a raw and unorganized fact that is required to be processed to make it
meaningful.
For e.g.Marks of a student can be data.
3) Data Item :A Data Item refers to a single unit of values. A data item that does not
have subordinate data items i.e. that cannot be further sub-divided is called an
elementary item.
For e.g. Roll Number, Age etc.
4) Group Item :A data item that is composed of one or more subordinate data
items is called a group item, i.e. data item that can be further sub-divided is called
a group item.
For e.g. Name – First_Name, Middle_name, Last_name.
5) Entity :An entity is a real-world object that has certain attributes or properties
which may be assigned values.
For e.g. Student, Employee, Chalk, etc.
2
8) Record :A record is a collection of field values of a given entity i.e. A single row in a
table is called a record. It is also called a complete set of information of an entity.
For e.g. A student’s Roll_no, Name, Class.
9) File :File is the collection of records of the entities in a given entity set.A collection
of data or information that has a name, called the filename.
10) Primary Key :A primary key is a field in a table which uniquely identifies each
row/record in a database table. Primary keys must contain unique values.
For e.g.Roll_No,etc.
3
Elementary Data Organization :
Organizing each data element at proper memory locations for efficient storage and
retrieval is called Elementary Data Organization. To organize data in an efficient way,
some data structures are used.
The different types of data structures used in a computer system are :
4
The size and type of variable values are not pre-specified, and it has additional
methods to perform certain operations.
Non-Primitive Data Structures are divided into two categories as Linear and
Non-Linear Data Structures, as :
i. Linear Data Structures :Linear data structure arranges the data into a
sequence and follow some sort of order.
In a linear data structure, data elements are arranged in a linear order
where each member element is connected to its previous and next
element.
Such data structures are easy to implement as computer memory is also
sequential. In a linear data structure, memory is not utilized in an efficient
way.
Linear Data Structures are further sub-divided into two types as Static
and Dynamic, as :
Graphicalrepresentation :
2. Linked List :A linked list is a linear data structure, in which the elements are not
stored at contiguous memory locations. The elements in a linked list are linked
using pointers.
Linked list can be visualized as a chain of nodes, where every node points to the
next node
Each node is divided into two parts :
First part contains the data element i.e. Information/Data field and second part
contains the address of next node i.e. Next Pointer Field. The node that contains
the Null Pointer indicates the end of list, as :
6
The entry point into a linked list is called the HEAD node or START node of the list.
It is the reference (pointer) to the first node.
3. Stack :Stack is an ordered list of similar data type. It is a linear list in which
Insertion and Deletion takes place only at one end, called TOP.
Stack is also called as LIFO [Last In First Out] or FILO [First In Last Out] data
structure.
For e.g. Stack of Dishes, Books, Cards, etc.
It is a simple data structure that allows adding and removing elements in a
particular order. Every time an element is added, it goes on the top of the stack
and the only element that can be removed is the element that is at the top of the
stack, just like a pile of objects.
Two pre-defined functions are used to insert and remove items from the stack, as:
push() is used to insert a new element in the stack, pop() is used to delete an
existing element from the list.
The simplest application of a stack is to reverse a word. You push a given word to
stack - letter by letter - and then pop letters from the stack.
4. QUEUE :A queue is a linear collection of objects that are inserted and removed
according to the First In First Out (FIFO) principle.
7
Queue is a linear data structure where the first element is inserted from one end
called REAR and deleted from the other end called FRONT.
According to its FIFO structure, element inserted first will also be removed first.
The pre-defined functions to add an element into queue is called enqueue() and to
remove an element from queue is called dequeue().
Front pointer is used to get the first data item from a queue.Rear pointer is used
to get the last item from a queue.
The above figure represents structure of a tree. This tree has 2 subtrees, as :
A is a parent of B and C.
B is called a child of A and also parent of D, E, F.
8
6. GRAPH :A Graph is a non-linear data structure consisting of nodes and edges. A
graph G can be defined as an ordered set G(V, E) where, G = Graph, V = set of
Vertices[Nodes] & E = set of Edges[To connect nodes].
9
Data Structure Operations :
The data in the data structures are processed by certain operations. The basic operations
that are performed on data structures are as follows:
3. Inserting :Inserting is the process to insert one or more data elements into adata
structure. Based on the requirement, new element can be added at the beginning,
end or any given index.
10
Concept and Notations of Algorithm :
Algorithm is a list of well defined finite steps to solve a particular problem.
Following are some of the conventions used in algorithms :
1.) Identifying the number :
Each algorithm is assigned a number such as ‘Algorithm 1.1’ that
indicates first algorithm from chapter 1.
11
Algorithm Complexity :
Complexity of an algorithm is a measure of the amount of time and/or space required by
an algorithm for an input of a given size (n).
Algorithm complexity is a rough approximation of the number of steps, which will be
executed depending on the size of the input data. Complexity gives the order of steps
count, not their exact count.
Suppose X is an algorithm and n is the size of input data, the time and space used by the
algorithm X are the two main factors, which decide the efficiency of X.
Time Factor − Time is measured by counting the number of key operations such as
comparisons in the sorting algorithm.
Space Factor − Space is measured by counting the maximum memory space
required by the algorithm.
The complexity of an algorithm f(n) gives the running time and/or the storage space
required by the algorithm in terms of n as the size of input data.
1. Time Complexity : Time complexity of an algorithm represents the amount of
time required by the algorithm to run to completion. Time requirements can be
defined as a numerical function T(n), where T(n) can be measured as the number of
steps, provided each step consumes constant time.
For example, addition of two n-bit integers takes n steps. Consequently, the
total computational time is T(n) = c ∗ n, where c is the time taken for the addition of
two bits. Here, we observe that T(n) grows linearly as the input size increases.
2. Space Complexity : Space complexity of an algorithm represents the amount of
memory space required by the algorithm in its life cycle. The space required by an
algorithm is equal to the sum of the following two components −
A fixed part that is a space required to store certain data and variables, that are
independent of the size of the problem. For example, simple variables and
constants used, program size, etc.
A variable part is a space required by variables, whose size depends on the size of
the problem. For example, dynamic memory allocation, recursion stack space,
etc.
Space complexity S(P) of any algorithm P is S(P) = C + SP(I), where C is the
fixed part and S(I) is the variable part of the algorithm, which depends on instance
characteristic I. Following is a simple example that tries to explain the concept −
Algorithm: SUM(A, B)
Step 1 - START
12
Step 2 - C← A + B + 10
Step 3 - Stop
Here we have three variables A, B, and C and one constant. Hence S(P) = 1 + 3. Now,
space depends on data types of given variables and constant types and it will be
multiplied accordingly.
13
Linear Array and its representation in Memory :
A linear array, is a list of finite homogeneous data elements such that –
i. Elements of the array are referenced by consecutive index numbers.
ii. Elements of the array stored respectively at continuous memory locations.
The total number of elements in the array is the length of an array. It can be calculated
using the formula :
Length = UB – LB + 1
Where, UB = Upper Bound(Largest index number of array) & LB = Lower Bound(Smallest
index number of array).
An array can be declared as :
int x[5] = {10,20,30,40,50}; (Declaration & Initialization)
OR
int x[5]; (Declaration)
Such that
X [0] = 10 X [1] = 20 X [2] = 30 X [3] = 40 X [4] = 50
14
- Linear Array representation in Memory
Let ‘LA’ be the linear array in the memory of a computer as shown below :
The elements of ‘LA’ are stored at successive memory cells. A Computer does not keep
track of every element in the array, but needs to keep track of address of only first
element called the base address denoted by BASE(LA).
By using this base address, computer calculates the address of any element of array
using the following formula :
LOC( LA ( 3 ) ) = 1000 + 2 ( 3 – 0 )
= 1000 + 2 ( 3 )
= 1000 + 6
= 1006
Therefore, LOC( LA ( 3 ) ) = 1006
15
Traversing in Linear Array :
Traversing refers to visiting or processing each element at least once. Traversal is done
by starting with the first element of the array and reaching to the last.
For e.g. To print all the elements of an array, To count the number of elements of the
array.
Algorithm : Traverse LA(LA, LB, UB)
Description : Here, Consider ‘LA’ is the Linear array with Lower Bound ‘LB’ and Upper
Bound ‘UB’. This algorithm traverses array LA applying an operation ‘PROCESS’ to each
element of ‘LA’ .
OR
Step 1 : Repeat for K= LB to UB
Step 2 : Apply PROCESS to LA[K]
[End of Loop].
Step 3 : Exit
16
Write a program to traverse & print all the elements of a linear array.
#include<stdio.h>
#include<conio.h>
void main()
{
int LA[5];
clrscr();
for(int k=0;k<=4;k++)
{
printf(“%d”,LA[k]);
}
getch();
}
17
Step 1 : [Initialize counter k with index of last element] Set k :=N-1
Step 2 : While k>=LOC repeat steps 3 and 4
Step 3 : [Move the current element one position backwards]
DATA[k+1] :=DATA[k]
Step 4 : [Decrement counter k] Set k :=k-1
Step 5 : [Insert new element at the Location] DATA[LOC] :=ITEM
Step 6 : [Update total number of array elements] Set N:=N+1
18
Deletion in Linear Array :
Deletion refers to removing an existing element from the array and re-organizing all
elements of an array.
Deletion in Linear Array involves three cases, as :
1. At the beginning
2. At a given location
3. At the end
Delete the element and shift the rest of the elements to left by one position.
Algorithm : DeleteLA (DATA, N, ITEM, LOC)
Description : This algorithm deletes an element at given position in a linear array DATA
with N elements and stores in ITEM.
When LOC=1/0 it means the element to be deleted is at the beginning
When LOC =N it means the element to be deleted is at the end
When LOC = N-1 it means the element to be deleted is at J-th (given) location.
19
Searching Methods - Linear & Binary Search :
Searching in data structure refers to the process of finding location LOC of an element in
a list.
It is the process of finding a given value position in a list of values.
Any search is said to be successful or unsuccessful depending upon whether the element
that is being searched is found or not.
To search an element in a given array, two popular algorithms are available:
If we need to search the element 20, it will go step by step in a sequential order, thus it is
also called sequential search.
21
In Binary Search, the Smallest index is called BEGINNING and the Largest index is called
END, while the MIDDLE element index is called MID.
To divide the array into two halves, i.e. to get the middle index value of array, a formula
is used as :
MID = INT((BEG + END)/2)
IF the ITEM to be searched is equal to the value of MID, LOC is set to MID, as :
If ITEM=DATA[MID] then
LOC=MID
IF the ITEM to be searched is greater than the value of MID, Set BEG = MID + 1, as :
If ITEM>DATA[MID] then
BEG = MID + 1
IF the ITEM to be searched is less than the value of MID, Set END=MID – 1, as :
If ITEM<DATA[MID] then
END=MID – 1
For e.g. Let DATA be the Linear Array with 10 elements.
DATA[10] = { 11,22,30,33,40,44,50,60,66,77 }
Find LOC of 44 using Binary Search?
DATA[10] = { 11,22,30,33,40,44,50,60,66,77 }
Solution : Here ITEM = 44,
Set BEG=1, End=10
MID = INT((BEG + END)/2)
= INT((1+10)/2)
= INT(11/2)
= INT(5.5)
MID = 5, thus DATA[5] = 40
Since ITEM > DATA[MID]
i.e. 44 > 40
Thus, BEG = MID + 1
= 5+1
BEG = 6
22
Now,
MID = INT((BEG + END)/2)
= INT(( 6+10 )/2)
= INT (16 / 2)
= INT(8)
MID = 8, thus DATA[8] = 60
Since ITEM < DATA[MID]
i.e. 44 < 60
Thus, END = MID - 1
= 8-1
END = 7
Now,
MID = INT((BEG + END)/2)
= INT(( 6+7 )/2)
= INT (13 / 2)
= INT(6.5)
MID = 6, thus DATA[6] = 44
Since ITEM = DATA[MID]
i.e. 44 = 44
Thus, LOC = MID
LOC = 6
The search is successful.
23
Algorithm : BinarySearch (DATA, LB, UB, ITEM, BEG, END, MID)
Description : This algorithm uses binary search method to search a given element ITEM
in the sorted linear array DATA by dividing the array into two parts until the element is
found.
Here, DATA is sorted array with lower bound LB & upper bound UB. The variables BEG,
END & MID denotes the Beginning, End & Middle locations of the array.
This algorithm finds the location of LOC or sets LOC=NULL if the search is unsuccessful.
MID:=INT((BEG+END)/2)
24