Unit 1
Unit 1
Unit 1
UNIT-1
Basic Terminology used in Data Structures:
Data: Data refers to a value or set of values. Data item refers to a single or group of values
within the data. e.g. Marks obtained by the students.
Entity: An entity is a thing that has some properties which can take values.
Field: A field is part of a record and contains a single piece of data for the subject of the
record
Record: A record is a collection of fields, possibly of different data types, typically in fixed
number and sequence.
File:A file is a collection of data stored on mass storage
Data Organization:
Each record in a file may contain many field items, but the value in a certain field may uniquely
determine the record in the file. Such a field K is called a Primary Key, and the values k1,k2…. In
such a field are called keys or key values.
Data Structure is a way to organize the data in some way so that we can do certain operations(
insert, delete, search, sort) on these data in an effective way.
1.It must reflect the actual relationship of data in the real world.
Ex:-Array , string
Heterogeneous D.S :- All the data elements may not be of same type.
Linear D.S :- Data items are arranged in a Linear sequence & forms a linear ordering in memory.
so data can be processed one by one sequentially.
Non linear D.S:- Data items are arranged in a scattered manner or not in sequence.
Static D.S.:- In these sizes and structures associated with memory location are fixed at compile
time.
Dynamic D.S. :- Which expand or shrink as required during program execution and their
associated memory locations changes.
These are basic data structures & are directly operated upon by the machine instructions.
Ex:- integers, characters, floats, pointers , character constant, string constant etc.
1.integer:-These are whole numbers without fractional part. Integer takes 2 bytes in memory. it may be
signed , unsigned, long
3.character:-We can store any character, its size is 1 byte. It is represented in computer memory by their
ASCII value. The ASCII value of A is 65.but space is 32.
Ex:- x , r , A , z, 4,*, ( ,+
4.Pointer:- Pointer is a variable which store the address . Pointers are used to access and store data in
memory.
float * p;
char *p;
These are more sophisticated data structures and derived from primitive data structures.
They are composed of same or different data structures.so they may be Homogeneous or
Heterogeneous.
2.Stacks:- It is a Linear list in which insertion and deletions can take place only at one end called Top. so
it is called LIFO kind of Data Structure. It is similar to stack of dishes. Operations like push, pop is done in
Stacks.
Insertion of element into Stack is called Push and deletion operation is called Pop.
3.Queue:- It is a linear list where deletions can take place only at one end of the list called FRONT and
insertion can take place only at other end called REAR.
It is a FIFO kind of data structure. It operates just like people waiting at bus stop.
The tree contains elements as its node. Each node can be deleted or added to it.
Tree is a Nonlinear data structure where data items are stored in sorted Sequence.
A node is present at the Top of the Tree is called ROOT. The remaining nodes are partitioned into no. of
Sub Trees.
5.LINKED LIST
A link list is a linear collection of data elements called nodes where the linear order is maintained by
pointers.
The nodes are non sequential . Each node is implemented in computer by a self referential STRUCTURE.
• 2nd part called link field or next pointer field contains the address of next node (LINK)
6.Graph:-A graph G is an Ordered set(V,E) where V represents the set of elements, called Vertices and E
represents the Edge between these elements.
An edge connects a pair of vertices. vertices on graph are shown as circles. so an edge can be
represented as E=(V,W) Graphs may be different types :-
4.Traversal-Accessing each element exactly once .This operation is called Visiting the elements.
6.Merging-Combining the elements of two similar sorted list of data into a single list.
SPACE COMPLEXITY
Space complexity is a measure of the amount of working storage an algorithm needs. That means how
much memory, in the worst case, is needed at any point in the algorithm. As with time complexity, we're
mostly concerned with how the space needs grow, in big-Oh terms, as the size N of the input problem
grows.
requires 3 units of space for the parameters and 1 for the local variable, and this never changes, so this
is O(1).
If a function A uses M units of its own space (local variables and parameters), and it calls a function B
that needs N units of local space, then A overall needs M + N units of temporary workspace. What if A
calls B 3 times? When a function finishes, its space can be reused, so if A calls B 3 times, it still only
needs M + N units of workspace.
What if A calls itself recursively N times? Then its space can't be reused because every call is still in
progress, so it needs O(N2) units of workspace.
Time Complexity
Time Complexity of an algorithm is the representation of the amount of time required by the algorithm
to execute to completion. Time requirements can be denoted or defined as a numerical function t(N),
where t(N) can be measured as the number of steps, provided each step takes constant time.
Big O Notation expresses the run time of an algorithm in terms of how quickly it grows relative to the
input ‘n’ by defining the N number of operations that are done on it. Thus, the time complexity of an
algorithm is denoted by the combination of all O[n] assigned for each line of function.
Time complexity of an algorithm signifies the total time required by the program to run till its
completion.
The time complexity of algorithms is most commonly expressed using the big O notation. It's an
asymptotic notation to represent the time complexity. We will study about it in detail in the next
tutorial.
Time Complexity is most commonly estimated by counting the number of elementary steps performed
by any algorithm to finish execution. Like in the example above, for the first code the loop will
run n number of times, so the time complexity will be n at least and as the value of n will increase the
time taken will also increase. While for the second code, time complexity is constant, because it will
never be dependent on the value of n, it will always give the result in 1 step.
Order of growth is how the time of execution depends on the length of the input. In the above example,
we can clearly see that the time of execution is linearly depends on the length of the array. Order of
growth will help us to compute the running time with ease. We will ignore the lower order terms, since
the lower order terms are relatively insignificant for large input. We use different notation to describe
limiting behavior of a function.
O-notation:
To denote asymptotic upper bound, we use O-notation. For a given function g(n), we denote
by O(g(n)) (pronounced “big-oh of g of n”) the set of functions:
O(g(n))= { f(n) : there exist positive constants c and n0 such that 0≤f(n)≤c∗g(n) for all n≥n0 }
Ω-notation:
To denote asymptotic lower bound, we use Ω-notation. For a given function g(n), we denote
by Ω(g(n)) (pronounced “big-omega of g of n”) the set of functions:
Ω(g(n))= { f(n) : there exist positive constants c and n0 such that 0≤c∗g(n)≤f(n) for all n≥n0 }
Θ-notation:
To denote asymptotic tight bound, we use Θ-notation. For a given function g(n), we denote
by Θ(g(n)) (pronounced “big-theta of g of n”) the set of functions:
Θ(g(n))= { f(n) : there exist positive constants c1,c2 and n0 such that 0≤c1∗g(n)≤f(n)≤c2∗g(n) for
all n>n0 }
The concept of dynamic memory allocation in c language enables the C programmer to allocate
memory at runtime. Dynamic memory allocation in c language is possible by 4 functions of
stdlib.h/alloc.h header file.
1. malloc()
2. calloc()
3. realloc()
4. free()
1. malloc( )
The malloc() function allocates single block of requested memory.
ptr=(cast-type*)malloc(byte-size)
program :-
2.calloc( )
3.realloc( )
If memory is not sufficient for malloc() or calloc(), you can reallocate the memory by realloc() function.
In short, it changes the memory size.Let's see the syntax of realloc() function.
4.free( )
The memory occupied by malloc() or calloc() functions must be released by calling free() function.
Otherwise, it will consume memory until program exit.Let's see the syntax of free() function.
free(pointer name) ;
LINKED LIST
It is a linear collection of data elements called Nodes, that are stored in different memory locations
connected by pointers.
Advantages
Disadvantages
4. Searching is Slow.
1. Linear linked list or One way linked list or Single linked list
4.Searching of an Element.
5.Sorting of elements.
It is a One way collection of Nodes where the linear order is maintained by pointers. The nodes are
not in sequence. Each node is implemented in computer by a self referential STRUCTURE.
• 2nd part is link field contains the address of next node of the list (LINK)
1.An External pointer START contains the address of the 1st node. So an arrow drawn from START to
the 1st node of the list.
2.Last Node contains NULL . It is a NULL pointer indicates End of the list.
struct node
int info ;
};
These are called Self Referential structures because a structure contains pointers to structure of its own
type.
The new node is created & the address of the new node is assigned to start as :-
Single linked list where link field of last node contains address of the first node .
We can assign two External Pointer START AND LAST to Circular linked list OR we can assign
ONLY One External Pointer START to Circular Linked List.
We can declare the structure for the Circular linked list as:-
Struct Node {
int info ;
};
Double linked list is a linear collection of nodes where each node has two link fields :-
The previous and Next Pointers are used to point the successor node and Predecessor node
respectively.
Struct node
int info ;
};
The last node contains the address of the first node in Next part instead of NULL & first node
contains address of the last node in the PREV. part.
We can assign two pointers START and LAST or one start pointer to the linked list
Struct node {
int info ;
Questions:-
write the difference between single and single circular or circular linked list ?
Write the difference between double and circular double linked list?
SPARSE MATRIX
A matrix can be defined as a two-dimensional array having 'm' columns and 'n' rows representing m*n
matrix. Sparse matrices are those matrices that have the majority of their elements equal to zero. In other
words, the sparse matrix can be defined as the matrix that has a greater number of zero elements than
the non-zero elements.
We can also use the simple matrix to store the elements in the memory; then why do we need to use the
sparse matrix. The following are the advantages of using a sparse matrix:
o Storage: As we know, a sparse matrix that contains lesser non-zero elements than zero so less
memory can be used to store elements. It evaluates only the non-zero elements.
o Computing time: In the case of searching n sparse matrix, we need to traverse only the non-zero
elements rather than traversing all the sparse matrix elements. It saves computing time by
logically designing a data structure traversing non-zero elements.
Representing a sparse matrix by a 2D array leads to the wastage of lots of memory. The zeroes in the
matrix are of no use to store zeroes with non-zero elements. To avoid such wastage, we can store only
non-zero elements. If we store only non-zero elements, it reduces the traversal time and the storage
space.
The non-zero elements can be stored with triples, i.e., rows, columns, and value. The sparse matrix can be
represented in the following ways:
o Array representation
o Linked list representation
Array Representation
The 2d array can be used to represent a sparse matrix in which there are three rows named as:
Let's understand the sparse matrix using array representation through an example.
As we can observe above, that sparse matrix is represented using triplets, i.e., row, column, and value. In
the above sparse matrix, there are 13 zero elements and 7 non-zero elements. This sparse matrix occupies
5*4 = 20 memory space. If the size of the sparse matrix is increased, then the wastage of memory space
will also be increased. The above sparse matrix can be represented in the tabular form shown as below:
PROGRAM ON ARRAY IMPLEMENTATION OF SPARSE MATRIX IN C:
In linked list representation, linked list data structure is used to represent a sparse matrix. In linked list
representation, each node consists of four fields whereas, in array representation, there are three fields,
i.e., row, column, and value. The following are the fields in the linked list:
Let's understand the sparse matrix using linked list representation through an example.
In the above figure, sparse represented in the linked list form. In the node, first field represents the index
of row, second field represents the index of column, third field represents the value and fourth field
contains the address of the next node.
*************