Unit 1
Unit 1
Unit 1
View Level
View 1 View 2 --------- View n
Logical Level
Physical Level
Stored database
Or
Name Rollno
5. There can be only one value in each row column cell in a table (i.e.,) The value of a single tuple rollno
cannot be 1,2.
Rollno Name
1,2 Anu
Acc. No Balance
Cust id Cust City
Department
Dname Dno Dlocation Mgrname
Employee Project
Eno Ename Bdate Address Pname Pno Plocation
The above figure shows three record types namely Department, Employee and Project. There are two
PCR types. They are (Department, Employee) and (Department, Project). Each occurrence of the
(Department, Employee) PCR types relates one department record to the records of the many employees
who work in that department.
Department: Research
Employee Project
Here, a record type corresponds to a node of the tree and a PCR type corresponds to an edge of the tree.
Virtual parent-child relationships
The hierarchical model has problems with M:N relationships. To solve this problem virtual parent-child
relationship (VPCR) type was introduced.
E Employee P Project
R ppointer
Here project is called the virtual parent of ppointer, ppointer is called the virtual child. It also avoids
redundancy.
Data definition in Hierarchical model:
The HDDL (Hierarchical data definition language) is used for defining the schema.
Data manipulation in Hierarchical model:
The HDML (Hierarchical data manipulation language) is used for inserting, deleting, selecting,
replacing the data.
3.6.Network data model:
The basic data structures used in network model are called records and sets. Data is stored in records. A
record is a collection of data values. Records are classified into record types.
Examples:
Student
Name Rollno Address Dept
Here record type is student. Data items are name, rollno, address, dept. values for these data items are called
records.
1:N Relationship representation:
A 1:N Relationship between two record types are called set type. They are represented by Bachman Diagram as
follows. Each set type definition has
Department
Dname Dno Dlocation
Student
7 Department of IT, PSNA CET
Name Rollno Address Dept
Set instances:
Department CSE ……….
(owner)
Arun ……….
Balaji ……….
Student
Chandru ……….
(member) Dinesh ……….
Elango ……….
Projection(π)
The projection eliminates all attributes of the input relation but those mentioned in the projection list. The
projection method defines a relation that contains a vertical subset of Relation.
This helps to extract the values of specified attributes to eliminates duplicate values. (pi) The symbol used to
choose attributes from a relation. This operation helps you to keep specific columns from a relation and discards
the other columns.
Example of Projection:
Consider the following table
CustomerID CustomerName Status
1 Google Active
2 Amazon Active
3 Apple Inactive
4 Alibaba Active
Here, the projection of CustomerName and status will give
Π CustomerName, Status (Customers)
CustomerName Status
Google Active
Amazon Active
Apple Inactive
Alibaba Active
Union operation (U)
UNION is symbolized by ∪ symbol. It includes all tuples that are in tables A or in B. It also eliminates
duplicate tuples. So, set A UNION set B would be expressed as:
The result <- A ∪ B
For a union operation to be valid, the following conditions must hold -
12 Department of IT, PSNA CET
•R and S must be the same number of attributes.
•Attribute domains need to be compatible.
•Duplicate tuples should be automatically removed.
Example
Consider the following tables.
Table A Table B
1 1 1 1
1 2 1 3
A ∪ B gives
Table A ∪ B
column 1 column 2
1 1
1 2
1 3
column 1 column 2
1 2
Intersection
An intersection is defined by the symbol ∩
A∩B
Defines a relation consisting of a set of all tuple that are in both A and B. However, A and B must be union-
compatible.
Example:
A∩B
Table A ∩ B
column 1 column 2
1 1
column 1 column 2
1 1
1 1
Join Operations
Join operation is essentially a cartesian product followed by a selection criterion.
Join operation denoted by ⋈.
JOIN operation also allows joining variously related tuples from different relations.
Types of JOIN:
Various forms of join operation are:
Inner Joins:
• Theta join
• EQUI join
• Natural join
Outer join:
• Left Outer Join
• Right Outer Join
• Full Outer Join
Inner Join:
In an inner join, only those tuples that satisfy the matching criteria are included, while the rest are excluded.
Let's study various types of Inner Joins:
Theta Join:
The general case of JOIN operation is called a Theta join. It is denoted by symbol θ
Example
A ⋈θ B
Theta join can use any conditions in the selection criteria.
For example:
A ⋈ A.column 2 > B.column 2 (B)
A ⋈ A.column 2 > B.column 2 (B)
column 1 column 2
1 2
EQUI join:
When a theta join uses only equivalence condition, it becomes a equi join.
For example:
A ⋈ A.column 2 = B.column 2 (B)
14 Department of IT, PSNA CET
A ⋈ A.column 2 = B.column 2 (B)
column 1 column 2
1 1
EQUI join is the most difficult operations to implement efficiently in an RDBMS and one reason why RDBMS
have essential performance problems.
NATURAL JOIN (⋈)
Natural join can only be performed if there is a common attribute (column) between the relations. The name and
type of the attribute must be same.
Example
Consider the following two tables
C
Num Square
2 4
3 9
Num Cube
2 8
3 18
C⋈D
C⋈D
2 4 4
3 9 9
OUTER JOIN
In an outer join, along with tuples that satisfy the matching criteria, we also include some or all tuples that do
not match the criteria.
Left Outer Join(A B)
In the left outer join, operation allows keeping all tuple in the left relation. However, if there is no matching
tuple is found in right relation, then the attributes of right relation in the join result are filled with null values.
Num Square
2 4
4 16
Num Cube
2 8
3 18
5 75
A B
A⋈B
2 4 4
3 9 9
4 16 -
Right Outer Join: ( A B)
In the right outer join, operation allows keeping all tuple in the right relation. However, if there is no matching
tuple is found in the left relation, then the attributes of the left relation in the join result are filled with null
values.
A B
A⋈B
2 8 4
3 18 9
5 75 -
Full Outer Join: ( A B)
In a full outer join, all tuples from both relations are included in the result, irrespective of the matching
condition.
A B
A⋈B
2 4 8
4 16 -
5 - 75
Summary
Operation Purpose
Select(σ) The SELECT operation is used for selecting a subset of the tuples according to a given
selection condition
Projection (π) The projection eliminates all attributes of the input relation but those mentioned in the
projection list.
Union UNION is symbolized by symbol. It includes all tuples that are in tables A or in B.
Operation (∪)
Set Difference - Symbol denotes it. The result of A - B, is a relation which includes all tuples that are in A
(-) but not in B.
Intersection (∩) Intersection defines a relation consisting of a set of all tuple that are in both A and B.
Inner Join Inner join, includes only those tuples that satisfy the matching criteria.
Theta Join(θ) The general case of JOIN operation is called a Theta join. It is denoted by symbol θ.
EQUI Join When a theta join uses only equivalence condition, it becomes a equi join.
Natural Join(⋈) Natural join can only be performed if there is a common attribute (column) between the
relations.
Outer Join In an outer join, along with tuples that satisfy the matching criteria.
Left Outer In the left outer join, operation allows keeping all tuple in the left relation.
Join( )
Right Outer join In the right outer join, operation allows keeping all tuple in the right relation.
( )
Full Outer In a full outer join, all tuples from both relations are included in the result irrespective of the
Join( ) matching condition.
Introduction to NoSQL
What is a NoSQL database?
NoSQL, also referred to as “not only SQL”, “non-SQL”, is an approach to database design that enables the
storage and querying of data outside the traditional structures found in relational databases. While it can still
store data found within relational database management systems (RDBMS), it just stores it differently compared
to an RDBMS. The decision to use a relational database versus a non-relational database is largely contextual,
and it varies depending on the use case.
Instead of the typical tabular structure of a relational database, NoSQL databases, house data within one data
structure, such as JSON document. Since this non-relational database design does not require a schema, it offers
rapid scalability to manage large and typically unstructured data sets.
NoSQL is also type of distributed database, which means that information is copied and stored on various
servers, which can be remote or local. This ensures availability and reliability of data. If some of the data goes
offline, the rest of the database can continue to run.
Today, companies need to manage large data volumes at high speeds with the ability to scale up quickly to run
modern web applications in nearly every industry. In this era of growth within cloud, big data, and mobile
and web applications, NoSQL databases provide that speed and scalability, making it a popular choice for their
performance and ease of use.
Key Highlights on SQL vs NoSQL
SQL NoSQL
These databases have fixed or static or predefined schema They have a dynamic schema
Examples: MySQL, PostgreSQL, Oracle, MS-SQL Server, Examples: MongoDB, HBase, Neo4j, Cassandra,
etc etc
NoSQL is a non-relational database that is used to store the data in the nontabular form. NoSQL stands for Not
only SQL. The main types are documents, key-value, wide-column, and graphs.
Types of NoSQL Database:
• Document-based databases
• Key-value stores
• Column-oriented databases
• Graph-based databases
Document-Based Database:
The document-based database is a nonrelational database. Instead of storing the data in rows and columns
(tables), it uses the documents to store the data in the database. A document database stores data in JSON,
BSON, or XML documents.
Documents can be stored and retrieved in a form that is much closer to the data objects used in applications
which means less translation is required to use these data in the applications. In the Document database, the
particular elements can be accessed by using the index value that is assigned for faster querying.
Collections are the group of documents that store documents that have similar contents. Not all the documents
are in any collection as they require a similar schema because document databases have a flexible schema.
Key features of documents database:
• Flexible schema: Documents in the database has a flexible schema. It means the documents in the
database need not be the same schema.
25 Department of IT, PSNA CET
• Faster creation and maintenance: the creation of documents is easy and minimal maintenance is required
once we create the document.
• No foreign keys: There is no dynamic relationship between two documents so documents can be
independent of one another. So, there is no requirement for a foreign key in a document database.
• Open formats: To build a document we use XML, JSON, and others.
Key-Value Stores:
A key-value store is a nonrelational database. The simplest form of a NoSQL database is a key-value store.
Every data element in the database is stored in key-value pairs. The data can be retrieved by using a unique key
allotted to each element in the database. The values can be simple data types like strings and numbers or
complex objects.
A key-value store is like a relational database with only two columns which is the key and the value.
Key features of the key-value store:
• Simplicity.
• Scalability.
• Speed.
Column Oriented Databases:
A column-oriented database is a non-relational database that stores the data in columns instead of rows. That
means when we want to run analytics on a small number of columns, you can read those columns directly
without consuming memory with the unwanted data.
Columnar databases are designed to read data more efficiently and retrieve the data with greater speed. A
columnar database is used to store a large amount of data. Key features of columnar oriented database:
• Scalability.
• Compression.
• Very responsive.
Graph-Based databases:
Graph-based databases focus on the relationship between the elements. It stores the data in the form of nodes in
the database. The connections between the nodes are called links or relationships.
Key features of graph database:
In a graph-based database, it is easy to identify the relationship between the data by using the links.
The Query’s output is real-time results.
The speed depends upon the number of relationships among the database elements.
Updating data is also easy, as adding a new node or edge to a graph database is a straightforward task that does
not require significant schema changes.