
Chapter-1
INTRODUCTION TO DATA STRUCTURES

Definition: A data structure is a way of collecting and organizing data in such a
way that we can perform operations on the data effectively.
Popularly used data structures are arrays, stacks, queues, graphs, trees and
linked lists.

Data Structure = Organized data + Performed Operations

Advantages of Data Structure –


1. Data structures allow information to be stored efficiently on hard disks.
2. An appropriate choice of ADT (Abstract Data Type) makes a program more efficient.
3. Data structures are necessary for designing efficient algorithms.
4. They provide reusability and abstraction.
5. Organizing the data properly is of utmost importance for generating bug-free results
in coding.
6. They reduce coding cost and enhance data accuracy, which is the goal of
organizations.
7. Studying them exposes the programmer to a wide range of problem-solving techniques.
8. Programmers who are competent in data structures and algorithms can easily perform
tasks related to data processing and automated reasoning.

Some examples of how data structures are used:


1. Indexing: Sophisticated data structures such as B-trees are used to
index objects, such as those stored in a database.

2. Storing data: Data structures are used for efficient data persistence, such as
specifying the collection of attributes and corresponding structures used to store
records in a database management system.

3. Managing resources and services: Core operating system (OS) resources and


services are enabled through the use of data structures such as linked lists for memory
allocation, trees for file directory management and file structures, and queues for
process scheduling.

4. Data exchange: Data structures define the organization of information shared


between applications, such as TCP/IP packets.

5. Searching: Indexes created using binary search trees, B-trees or hash tables speed up
the ability to find a specific sought-after item.


6. Ordering and sorting: Data structures such as the binary search tree (also known as an
ordered or sorted tree) provide efficient methods of sorting objects, such as character
strings used as tags.

Data types: The data type of a variable specifies the range of values and the amount of
storage space that the variable requires. The basic data types include the following (see
the declaration sketch below):

1. Integer – used for storing whole numbers, such as 5, 8, 67, 2390,
etc.
2. Character – refers to the ASCII character set, including single alphabets such
as ‘x’, ‘Y’, etc.
3. Double – holds large or high-precision numeric values that do not fit either the
floating-point or the integer data type.
4. Floating-point – refers to real (decimal) values, such as
40.1, 820.673, 5.9, etc.
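
As a quick illustration (a minimal sketch; the variable names are invented for this
example), the following C program declares one variable of each basic type:

#include <stdio.h>

int main(void)
{
    int count = 2390;          // integer: a whole number
    char grade = 'x';          // character: a single ASCII character
    double distance = 820.673; // double: a large/high-precision real value
    float price = 40.1f;       // floating-point: a decimal value

    printf("%d %c %f %f\n", count, grade, distance, price);
    return 0;
}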

What is the Difference Between Data Type and Data Structure?


Data Type: It tells the type of data a variable can accept.
Data Structure: It is a collection of data types.

Data Type: Its implementation is a form of abstract implementation, where different
languages define it differently.
Data Structure: Its implementation is called concrete implementation, since it defines
exactly what type of data it is going to store and deal with.

Data Type: A variable can only hold one particular value belonging to that data type.
Data Structure: It can hold different kinds and types of data within one single object.

Data Type: Values can be assigned directly to a variable, since the data type already
represents the kind of value that can be stored.
Data Structure: Data is assigned to the data structure object using a set of algorithms
and operations such as push, pop, and so on.

Data Type: There is no involvement of time complexity, since only the type and nature
of the data is of concern.
Data Structure: Time complexity comes into play, since a data structure mainly deals
with manipulation and execution of logic over the stored data.

Data Type examples: int, float, double.
Data Structure examples: stacks, queues, trees.

Need of Data Structure


1. Applications are becoming more complex and the amount of data is increasing day
by day, which can cause problems with processing speed, searching data, handling
multiple requests, etc.
2. A data structure provides a way of organizing, managing, and storing data efficiently.
With the help of a data structure, the data items can be traversed easily.
3. Data structures provide efficiency, reusability and abstraction.
4. They play an important role in enhancing the performance of a program, because the
main function of the program is to store and retrieve the user’s data as fast as possible.

Basic Terminology of Data Structures


Data − Data are values or set of values.
Data Item − A data item refers to a single unit of value.
Group Items − Data items that can be divided into sub-items are called Group Items.
Elementary Items − Data items that cannot be divided are called Elementary Items.
Attribute and Entity − An entity is that which contains certain attributes or
properties, which may be assigned values.
Entity Set − Entities of similar attributes form an entity set.
Field − Field is a single elementary unit of information representing an attribute of an
entity.
Record − Record is a collection of field values of a given entity.
File − File is a collection of records of the entities in a given entity set.

Characteristics of Data Structures


A data structure is a systematic way used to organise data. The characteristics of data
structures are:
Linear or Non-Linear
This characteristic describes whether the data items are arranged in sequential order,
as in arrays, or not, as in graphs and trees.
Static and Dynamic


Static data structures have fixed formats, sizes and memory locations, decided at
compile time; dynamic data structures can change size and structure at run time.
Time Complexity
The running time (execution time) of a program should be as small as possible; the
lower the running time, the more efficient the program.
Correctness
Each data structure must have an interface, which describes the set of operations the
structure supports, and the data structure should be implemented so that it accurately
realises this interface.
Space Complexity
The memory of the device should be managed carefully; the less space a data structure
occupies, the better the program makes use of the device.

Advantages of Data Structures


1. They allow easier processing of data.
2. They allow information to be stored on disk very efficiently.
3. They are necessary for designing efficient algorithms.
4. We can access data very easily.
5. They provide a secure way of storing data.
6. Graphs model real-life problems.
7. They allow processing of data on a software system.

Disadvantages of Data Structures


1. Advanced data structures are applicable only for advanced users.
2. If any issue occurs, it can be solved only by experts.
3. Access can be slow in case of some unexpected data types.

Classification of Data Structure

Data structures are broadly classified into primitive and non-primitive data structures.

Primitive Data structures:


Data structures that are directly operated upon by machine-level instructions are
known as primitive data structures.
Integers, floats, characters and pointers are primitive data structures.

Operations on Primitive Data structures:


Create: The create operation is used to create a new data structure. This operation
reserves memory space for the program elements. It may be carried out at compile time
or at run time.
Example: int n = 15; // memory space is reserved for n
Select: This operation is used by programmers to access the data within the data
structure.
Example: cin >> x;
Update: This operation is used to change the data of a data structure. An assignment
operation is a good example.
Example: n = 5; // modifies the value of n to store the new value 5 in it
Destroy: This operation is used to destroy or remove the data structure from
memory. In C++ one can destroy a data structure by using the operator called
delete. These four operations are sketched together in the example below.
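
The following is a minimal sketch of the four operations in C. The text's delete example
is C++; here the analogous C library calls malloc and free are used, and the variable
names are chosen only for this illustration.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *n = malloc(sizeof *n);   // Create: reserve memory space at run time
    if (n == NULL) return 1;

    scanf("%d", n);               // Select: access (read) the data in the structure
    *n = 5;                       // Update: modify the stored value to 5
    printf("n = %d\n", *n);

    free(n);                      // Destroy: remove the data structure from memory
    return 0;
}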

Non-Primitive Data structures:


The data structures that are derived from the primitive data structures are called non-
primitive data structures.
These data structures are used to store groups of values.
Non-primitive data structures are classified as arrays, lists and files.
Data structures under lists are further classified as linear and non-linear data structures.

Linear Data structures:


Linear data structures are data structures that have homogeneous elements.
They are data structures in which the elements are arranged in a sequence, forming a
linear series.
Linear data structures are very easy to implement, since the memory of the computer is
also organized in a linear fashion.
Some commonly used linear data structures are stacks, queues and linked lists.

Non-Linear Data structures:


A non-linear data structure is a data structure in which a data item can be connected
to several other data items.
Non-linear data structures may exhibit a hierarchical (parent-child) relationship.
The data elements are not arranged in a sequential structure.
The main non-linear data structures are trees and graphs.

Operations on Non-Primitive Data structures:


Traversing: The process of accessing each element exactly once to perform some
operation is called traversing.
Insertion: The process of adding a new element into the given collection of data
elements is called insertion.


Deletion: The process of removing an existing data element from the given
collection of data elements is called deletion.
Searching: The process of finding the location of a data element in the given
collection of data elements is called searching.
Sorting: The process of arranging data elements in ascending or
descending order is called sorting.

Merging: The process of combining the elements of two structures to form a


single structure is called merging. These operations are sketched on a small array in
the example below.
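
As a rough sketch of these operations in C (the array contents, its size and the index
values are invented for this illustration), the fragment below inserts, deletes, searches
and finally traverses a small array:

#include <stdio.h>

int main(void)
{
    int a[10] = {10, 20, 40, 50};
    int n = 4;                              // current number of elements

    // Insertion: place 30 at index 2 by shifting later elements right
    for (int i = n; i > 2; i--) a[i] = a[i - 1];
    a[2] = 30;
    n++;

    // Deletion: remove the element at index 0 by shifting elements left
    for (int i = 0; i < n - 1; i++) a[i] = a[i + 1];
    n--;

    // Searching: linear search for the value 40
    int pos = -1;
    for (int i = 0; i < n; i++)
        if (a[i] == 40) { pos = i; break; }

    // Traversing: visit every element exactly once and print it
    for (int i = 0; i < n; i++) printf("%d ", a[i]);
    printf("\nfound 40 at index %d\n", pos);
    return 0;
}

Sorting and merging follow the same pattern: each element of the input collection(s) is
visited and placed into its correct position in the result.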

Introduction to Algorithm
The word Algorithm means ”a set of finite rules or instructions to be followed
in calculations or other problem-solving operations”
Or
”a procedure for solving a mathematical problem in a finite number of steps
that frequently involves recursive operations”.
The word algorithm comes from the name of the mathematician Abu Ja’far Muhammad
ibn Musa al-Khwarizmi.

What are the Characteristics of an Algorithm?

1. Clear and Unambiguous: The algorithm should be unambiguous. Each of
its steps should be clear in all aspects and must lead to only one meaning.
2. Well-Defined Inputs: If an algorithm takes inputs, they should be
well-defined. It may or may not take input.


3. Well-Defined Outputs: The algorithm must clearly define what output will
be yielded, and it should be well-defined as well. It should produce at least one
output.
4. Finiteness: The algorithm must be finite, i.e. it should terminate after a
finite time.
5. Feasible: The algorithm must be simple, generic, and practical, such that
it can be executed with the available resources. It must not depend on some
future technology.
6. Language Independent: The algorithm designed must be language-
independent, i.e. it must be just plain instructions that can be implemented
in any language, and yet the output will be the same, as expected.
7. Input: An algorithm has zero or more inputs. Every instruction that contains a
fundamental operator must accept zero or more inputs.
8. Output: An algorithm produces at least one output. Every instruction that
contains a fundamental operator must produce an output.
9. Definiteness: All instructions in an algorithm must be unambiguous,
precise, and easy to interpret. By referring to any of the instructions in an
algorithm, one can clearly understand what is to be done. Every fundamental
operator in an instruction must be defined without any ambiguity.
10. Finiteness: An algorithm must terminate after a finite number of steps in
all test cases. Every instruction which contains a fundamental operator must
terminate within a finite amount of time. Infinite loops or recursive
functions without base conditions do not possess finiteness.
11. Effectiveness: An algorithm must be developed by using very basic,
simple, and feasible operations, so that one can trace it out by using just
paper and pencil.

An algorithm can be defined in three ways:


1. Natural language like English – the instructions must be definite and
effective.
2. Graphical representation, called a flowchart – works well only if the
algorithm is small and simple.
3. Pseudo code method – describes the algorithm as a program-like text
which resembles a language such as C. A small invented example follows this list.
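
For instance, a made-up task such as "find the larger of two numbers a and b" could be
written in the pseudo code style used later in this chapter for the Tower of Hanoi:

START
Procedure FindMax(a, b)
   IF a > b THEN
      max = a
   ELSE
      max = b
   END IF
   return max
END Procedure
STOP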

Analyze an Algorithm
For a standard algorithm to be good, it must be efficient. Hence the efficiency
of an algorithm must be checked and maintained. This can be done in two stages:
1. Priori Analysis:


“Priori” means “before”. Hence Priori analysis means checking the algorithm
before its implementation. In this, the algorithm is checked when it is written
in the form of theoretical steps. The efficiency of an algorithm is measured
by assuming that all other factors, for example, processor speed, are
constant and have no effect on the implementation. This is done usually by
the algorithm designer. This analysis is independent of the type of hardware
and language of the compiler. It gives the approximate answers for the
complexity of the program.
2. Posterior Analysis:
“Posterior” means “after”. Hence Posterior analysis means checking the
algorithm after its implementation. In this, the algorithm is checked by
implementing it in any programming language and executing it. This analysis
helps to get the actual and real analysis report about correctness(for every
possible input/s if it shows/returns correct output or not), space required,
time consumed, etc. That is, it is dependent on the language of the compiler
and the type of hardware used.

Performance analysis of an algorithm is performed by using the following


measures...
1. Space required to complete the task of that algorithm (Space Complexity). It
includes program space and data space
2. Time required to complete the task of that algorithm (Time Complexity).

What is Space complexity?


When we design an algorithm to solve a problem, it needs some computer
memory to complete its execution. For any algorithm, memory is required for
the following purposes...
To store program instructions.
To store constant values.
To store variable values.
And for a few other things like function calls, jumping statements, etc.

Space complexity of an algorithm can be defined as follows...


Total amount of computer memory required by an algorithm to complete its
execution is called as space complexity of that algorithm.
Generally, when a program is under execution it uses the computer memory for
THREE reasons. They are as follows...
Instruction Space: It is the amount of memory used to store compiled version
of instructions.


Environmental Stack: It is the amount of memory used to store information of


partially executed functions at the time of function call.
Data Space: It is the amount of memory used to store all the variables and
constants.
Example 1
int square(int a)
{
return a*a;
}
In the above piece of code, it requires 2 bytes of memory to store
variable 'a' and another 2 bytes of memory is used for return value.
That means, totally it requires 4 bytes of memory to complete its
execution. And this 4 bytes of memory is fixed for any input value of 'a'.
This space complexity is said to be Constant Space Complexity.

If any algorithm requires a fixed amount of space for all input values then that
space complexity is said to be Constant Space Complexity.
Consider the following piece of code...
Example 2
int sum(int A[ ], int n)
{
int sum = 0, i;
for(i = 0; i < n; i++)
sum = sum + A[i];
return sum;
}
In the above piece of code it requires
'n*2' bytes of memory to store the array variable 'A[ ]'
2 bytes of memory for the integer parameter 'n'
4 bytes of memory for the local integer variables 'sum' and 'i' (2 bytes each)
2 bytes of memory for the return value.

That means, totally it requires '2n+8' bytes of memory to complete its


execution. Here, the total amount of memory required depends on the value of
'n'. As 'n' value increases the space required also increases proportionately.
This type of space complexity is said to be Linear Space Complexity.


If the amount of space required by an algorithm is increased with the increase


of input value, then that space complexity is said to be Linear Space
Complexity.

What is Time complexity?


Every algorithm requires some amount of computer time to execute its
instruction to perform the task. This computer time required is called time
complexity.
The time complexity of an algorithm can be defined as follows...
The time complexity of an algorithm is the total amount of time required by an
algorithm to complete its execution.
Generally, the running time of an algorithm depends upon the following...
Whether it is running on Single processor machine or Multi processor
machine.
Whether it is a 32 bit machine or 64 bit machine.
Read and Write speed of the machine.
The amount of time required by an algorithm to
perform Arithmetic operations, logical operations, return value
and assignment operations etc.,
Input data
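
As a rough illustration of such an estimate (reusing the sum function from the space
complexity example above; the assumption here is a simple cost model in which every
statement costs one unit of time), the running time can be approximated by counting how
many times each statement executes:

int sum(int A[ ], int n)
{
   int sum = 0, i;          // executed 1 time
   for(i = 0; i < n; i++)   // i = 0 runs once; the test i < n runs n + 1 times
      sum = sum + A[i];     // executed n times
   return sum;              // executed 1 time
}

The total count grows in proportion to n, so the time required by this algorithm is
linear in the input size.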

What is Asymptotic Notation?


Whenever we want to perform analysis of an algorithm, we need to calculate
the complexity of that algorithm. But when we calculate the complexity of an
algorithm it does not provide the exact amount of resource required. So instead
of taking the exact amount of resource, we represent that complexity in a
general form (Notation) which produces the basic nature of that algorithm. We
use that general form (Notation) for analysis process.
Asymptotic notation of an algorithm is a mathematical representation of its
complexity.
we use THREE types of Asymptotic Notations and those are as follows...
Big - Oh (O)
Big - Omega (Ω)
Big - Theta (Θ)
Big - Oh Notation (O)
Big - Oh notation is used to define the upper bound of an algorithm in terms of
Time Complexity.
That means Big - Oh notation always indicates the maximum time required by
an algorithm for all input values. That means Big - Oh notation describes the


worst case of an algorithm time complexity.


Big - Oh Notation can be defined as follows...
Consider function f(n) as time complexity of an algorithm and g(n) is the most
significant term. If f(n) <= C g(n) for all n >= n0, C > 0 and n0 >= 1. Then we can
represent f(n) as O(g(n)).
f(n) = O(g(n))
Example
Consider the following f(n) and g(n)...
f(n) = 3n + 2
g(n) = n
If we want to represent f(n) as O(g(n)) then it must satisfy f(n) <= C g(n) for all
values of C > 0 and n0 >= 1
f(n) <= C g(n)
⇒ 3n + 2 <= C n
Above condition is always TRUE for all values of C = 4 and n >= 2.
By using Big - Oh notation we can represent the time complexity as
follows...
3n + 2 = O(n)
Big - Omega Notation (Ω)
Big - Omega notation is used to define the lower bound of an algorithm in terms
of Time Complexity.
That means Big - Omega notation always indicates the minimum time required by
an algorithm for all input values. That means Big - Omega notation describes the
best case of an algorithm time complexity.
Big - Omega Notation can be defined as follows...
Consider function f(n) as time complexity of an algorithm and g(n) is the most
significant term. If f(n) >= C g(n) for all n >= n0, C > 0 and n0 >= 1. Then we can
represent f(n) as Ω(g(n)).
f(n) = Ω(g(n))
Example
Consider the following f(n) and g(n)...
f(n) = 3n + 2
g(n) = n
If we want to represent f(n) as Ω(g(n)) then it must satisfy f(n) >= C g(n) for all
values of C > 0 and n0 >= 1
f(n) >= C g(n)
⇒ 3n + 2 >= C n
Above condition is always TRUE for all values of C = 1 and n >= 1.
By using Big - Omega notation we can represent the time complexity as
follows...
3n + 2 = Ω(n)
Big - Theta Notation (Θ)
Big - Theta notation is used to define the average bound of an algorithm in
terms of Time Complexity.
That means Big - Theta notation always indicates the average time required by
an algorithm for all input values. That means Big - Theta notation describes
the average case of an algorithm time complexity.
Big - Theta Notation can be defined as follows...

Dr. Ayesha Naaz Page 12


Chapter -1 Introduction to Data Structure

Consider function f(n) as time complexity of an algorithm and g(n) is the most
significant term. If C1 g(n) <= f(n) <= C2 g(n) for all n >= n0, C1 > 0, C2 > 0 and
n0 >= 1. Then we can represent f(n) as Θ(g(n)).
f(n) = Θ(g(n))
Example
Consider the following f(n) and g(n)...
f(n) = 3n + 2
g(n) = n
If we want to represent f(n) as Θ(g(n)) then it must satisfy C1 g(n) <= f(n) <=
C2 g(n) for all values of C1 > 0, C2 > 0 and n0 >= 1
C1 g(n) <= f(n) <= C2 g(n)
⇒ C1 n <= 3n + 2 <= C2 n
Above condition is always TRUE for all values of C1 = 1, C2 = 4 and n >= 2.
By using Big - Theta notation we can represent the time complexity as follows...
3n + 2 = Θ(n)
Difference between Posteriori analysis and A Priori analysis:

Posteriori analysis: It is a relative analysis.
Priori analysis: It is an absolute analysis.

Posteriori analysis: It is dependent on the language of the compiler and the type of hardware.
Priori analysis: It is independent of the language of the compiler and the type of hardware.

Posteriori analysis: It will give an exact answer.
Priori analysis: It will give an approximate answer.

Posteriori analysis: It doesn’t use asymptotic notations to represent the time complexity
of an algorithm.
Priori analysis: It uses asymptotic notations to represent how much time the algorithm
will take in order to complete its execution.

Posteriori analysis: The time complexity of an algorithm differs from system to system.
Priori analysis: The time complexity of an algorithm is the same for every system.

Posteriori analysis: If the time taken by the program is less, then the credit goes to the
compiler and hardware.
Priori analysis: If the algorithm runs faster, the credit goes to the programmer.

Posteriori analysis: It is done after execution of the algorithm.
Priori analysis: It is done before execution of the algorithm.

Posteriori analysis: It is costlier, because it requires software and hardware for execution.
Priori analysis: It is cheaper than posteriori analysis.

Posteriori analysis: A maintenance phase is required to tune the algorithm.
Priori analysis: A maintenance phase is not required to tune the algorithm.

Recursion
Recursion is the process of calling a function itself repeatedly until a particular
condition is met. A function that calls itself directly or indirectly is called a
recursive function and such kind of function calls are called recursive calls.

Fundamentals of Recursion in C
Recursion is built from two components, which are essential parts
of any recursive function. These are:
1. Recursion Case
2. Base Condition

1. Recursion Case
The recursion case refers to the recursive call present in the recursive function.
It decides what type of recursion will occur and how the problem will be divided into
smaller sub problems.
2. Base Condition
The base condition specifies when the recursion is going to terminate. It is the
condition that determines the exit point of the recursion. We have to be careful
while defining the base condition, because an incorrect or incomplete base
condition may cause the recursion to run infinitely. Both parts are marked in the
sketch below.
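
A minimal sketch in C (factorial is chosen here only as an illustration, and the function
name is invented):

// Recursive factorial: the base condition stops the recursion,
// the recursion case reduces the problem to a smaller sub problem.
int factorial(int n)
{
    if (n == 0)                      // base condition
        return 1;
    return n * factorial(n - 1);     // recursion case: calls itself with n - 1
}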


Types of Recursion
The recursion can be classified into different types based on what kind of
recursive case is present in the recursion.
Direct Recursion
Head Recursion
Tail Recursion
Tree Recursion
Indirect Recursion

1. Direct Recursion
Direct recursion is the most common type of recursion, where a function calls
itself directly within its own body. The recursive call can occur once or multiple
times within the function, which is why direct recursion is further classified, for
example into head, tail and tree recursion.

2.Indirect Recursion
Indirect recursion is an interesting form of recursion where a function calls
another function, which eventually calls the first function or any other function
in the chain, leading to a cycle of function calls. In other words, the functions
are mutually recursive. This type of recursion involves multiple functions
collaborating to solve a problem.
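
A small sketch of indirect (mutual) recursion in C, where two invented functions call
each other to decide whether a non-negative number is even or odd:

int isOdd(int n);                // forward declaration

int isEven(int n)
{
    if (n == 0) return 1;        // base condition: 0 is even
    return isOdd(n - 1);         // calls the other function
}

int isOdd(int n)
{
    if (n == 0) return 0;        // base condition: 0 is not odd
    return isEven(n - 1);        // calls back the first function
}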

Fibonacci Series using Recursive Function


The Fibonacci series is the sequence where each number is the sum of the
previous two numbers of the sequence. The first two numbers of the Fibonacci
series are 0 and 1 and are used to generate the Fibonacci series.
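
A hedged sketch of a recursive Fibonacci function in C (0-based numbering of the terms
is an assumption of this example):

// Returns the nth Fibonacci number; fib(0) = 0 and fib(1) = 1.
int fib(int n)
{
    if (n <= 1)                        // base condition
        return n;
    return fib(n - 1) + fib(n - 2);    // sum of the previous two terms
}

For example, fib(6) evaluates to 8, matching the series 0, 1, 1, 2, 3, 5, 8.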

Application of Fibonacci Series:


The Fibonacci sequence is a series of numbers where each term is the sum of
the previous two terms. The sequence has many applications in various fields


such as nature, music, art, architecture, physics, engineering, and computer
science. Some examples of applications of Fibonacci numbers are:
The golden ratio in music and art
The Fibonacci search technique and the Fibonacci heap data structure in
computer algorithms
The Fibonacci cubes in graphs for interconnecting parallel and distributed
systems
The coding, encryption, and decryption of information
The solution of many combinatorial problems

GCD using Recursive Function


GCD stands for Greatest Common Divisor and is also known as HCF (Highest
Common Factor). The GCD of two numbers is the largest positive integer that
completely divides both numbers without leaving a remainder.
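
A minimal recursive sketch of the Euclidean method in C (the function name and the use
of the remainder operator are choices made for this example):

// gcd(a, b) = gcd(b, a mod b), with gcd(a, 0) = a.
int gcd(int a, int b)
{
    if (b == 0)              // base condition
        return a;
    return gcd(b, a % b);    // recursive call on a smaller pair
}

For example, gcd(12, 18) returns 6.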

Tower of Hanoi
The Tower of Hanoi (also known as the pyramid puzzle) is a mathematical game or puzzle
consisting of three rods and a number of disks of various diameters, which can
slide onto any rod. The puzzle begins with the disks stacked on one rod in
order of decreasing size, the smallest at the top, thus approximating a conical
shape.
With 3 disks, the puzzle can be solved in 7 moves. The minimal number of
moves required to solve a Tower of Hanoi puzzle is 2^n − 1, where n is the
number of disks.

The Tower of Hanoi is thus a mathematical puzzle which consists of three towers (pegs)


and more than one ring.

These rings are of different sizes and stacked in ascending order, i.e.
the smaller one sits over the larger one. There are other variations of the puzzle
where the number of disks increases, but the tower count remains the same.


Rules
The mission is to move all the disks to another tower without violating
the sequence of arrangement. A few rules to be followed for the Tower of Hanoi are:

Only one disk can be moved among the towers at any given time.
Only the "top" disk can be removed.
No large disk can sit over a small disk.

The steps to follow are −


Step 1 − Move n-1 disks from source to aux
Step 2 − Move nth disk from source to dest
Step 3 − Move n-1 disks from aux to dest
A recursive algorithm for the Tower of Hanoi can be derived as follows −
START
Procedure Hanoi(disk, source, dest, aux)

IF disk == 1, THEN
move disk from source to dest
ELSE
Hanoi(disk - 1, source, aux, dest) // Step 1
move disk from source to dest // Step 2
Hanoi(disk - 1, aux, dest, source) // Step 3
END IF

END Procedure
STOP
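
The pseudo code above can be turned into a runnable C sketch as follows (the function
and parameter names mirror the pseudo code; printing each move is a choice made for
this example):

#include <stdio.h>

// Move 'disk' disks from rod 'source' to rod 'dest' using rod 'aux'.
void hanoi(int disk, char source, char dest, char aux)
{
    if (disk == 1) {
        printf("Move disk 1 from %c to %c\n", source, dest);
        return;
    }
    hanoi(disk - 1, source, aux, dest);                            // Step 1
    printf("Move disk %d from %c to %c\n", disk, source, dest);    // Step 2
    hanoi(disk - 1, aux, dest, source);                            // Step 3
}

int main(void)
{
    hanoi(3, 'A', 'C', 'B');   // 3 disks take 2^3 - 1 = 7 moves
    return 0;
}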

Difference between Iteration and Recursion


The major differences between iteration and recursion are as follows.

Definition: Recursion – the function calls itself. Iteration – a set of instructions is
repeatedly executed.

Application: Recursion is applied to functions. Iteration is applied to loops.

Termination: Recursion terminates through the base case, where there will be no further
function call. Iteration terminates when the termination condition for the iterator
ceases to be satisfied.

Usage: Recursion is used when code size needs to be small and time complexity is not an
issue. Iteration is used when time complexity needs to be balanced against an expanded
code size.

Code Size: Recursion gives a smaller code size; iteration gives a larger code size.

Time Complexity: Recursion has very high (often exponential) time complexity; iteration
has relatively lower time complexity (generally polynomial or logarithmic).

Space Complexity: The space complexity of recursion is higher than that of iteration;
the space complexity of iteration is lower.

Stack: In recursion the stack is used to store local variables when the function is
called; in iteration the stack is not used.

Speed: Recursive execution is slow, since it has the overhead of maintaining and
updating the stack; iteration is normally faster, as it does not use the stack.

Memory: Recursion uses more memory as compared to iteration; iteration uses less memory.

Overhead: Recursion has the overhead of repeated function calls; iteration has no such
overhead, as there are no function calls.

Infinite Repetition: In recursion, if the function does not meet a termination condition,
or the base case is not defined or is never reached, it leads to a stack overflow error
and there is a chance that the system may crash. In iteration, if the control condition
never becomes false or the control variable does not reach the termination value, it
causes an infinite loop that uses CPU cycles again and again.
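
To make the contrast concrete, here is a small C sketch of the same invented task
(factorial) written both recursively and iteratively:

// Recursive version: smaller code, uses the call stack.
int factRec(int n)
{
    return (n <= 1) ? 1 : n * factRec(n - 1);
}

// Iterative version: uses a loop, no extra stack frames.
int factIter(int n)
{
    int result = 1;
    for (int i = 2; i <= n; i++)
        result = result * i;
    return result;
}

Both functions return 120 for n = 5; the recursive form keeps n stack frames alive,
while the iterative form uses constant extra space.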
