Module 1 Final

Introduction to DBMS
MODULE 1
DATABASE MANAGEMENT SYSTEM
Databases and Database Users
1. Introduction
2. An Example
3. Characteristics of the Database Approach
4. Actors on the Scene
5. Workers behind the Scene
6. Advantages of Using the DBMS Approach
7. A Brief History of Database Applications
8. When Not to Use a DBMS

Database System Concepts and Architecture
1. Data Models, Schemas, and Instances
2. Three-Schema Architecture and Data Independence
3. Database Languages and Interfaces
4. The Database System Environment
5. Centralized and Client/Server Architectures for DBMSs
6. Classification of Database Management Systems

Conceptual Data Modelling using Entities and Relationships
1. Using High-Level Conceptual Data Models for Database Design
2. A Sample Database Application
3. Entity Types, Entity Sets, Attributes, and Keys
4. Relationship Types, Relationship Sets, Roles, and Structural Constraints
5. Weak Entity Types
6. Refining the ER Design for the COMPANY Database
7. ER Diagrams, Naming Conventions, and Design Issues
8. Example
Sudarsanan D, Assistant Professor, CITECH-ISE.

1.1 INTRODUCTION
“Good decisions require good information that is derived from raw facts”
These raw facts are known as data. Data are likely to be managed most
efficiently when they are stored in a database.
What is Data?
Data means known raw facts that can be recorded and that have implicit meaning.
For example, consider the names, telephone numbers, and addresses of the people you know.
Note: The word raw indicates that the facts have not yet been processed to
reveal their meaning.
What is Information?
Information is the result of processing raw data to reveal its meaning. Data processing
can be as simple as organizing data to reveal patterns or as complex as making forecasts or
drawing inferences using statistical modeling.
Why Databases:
Imagine trying to operate a business without knowing who your customers are,
what products you are selling, who is working for you, who owes you money, and whom
you owe money. All businesses have to keep this type of data and much more; and just as
importantly, they must have those data available to decision makers when they need them.
It can be argued that the ultimate purpose of all business information systems is to help
businesses use information as an organizational resource. At the heart of all of these
systems are the collection, storage, aggregation, manipulation, dissemination, and
management of data.
What is a Database? Explain.

A database is a collection of related data. It is a collection of large volumes of facts
and figures stored in an orderly manner. A database has the following implicit properties:
i. A database represents some aspect of the real world, sometimes called the mini-world.
ii. It is a logically coherent collection of data with some inherent meaning.
iii. A database is designed, built, and populated with data for a specific purpose.
It has an intended group of users.

What is a Database Management System (DBMS)?
or
What are the four main types of actions involved in a database? Briefly
discuss each.
or
What does defining, constructing, manipulating, and sharing a database
mean?
A database management system (DBMS) is a collection of programs that enables users to
create and maintain a database. The DBMS is a general-purpose software system that
facilitates the processes of defining, constructing, manipulating, and sharing databases
among various users and applications.
Defining a database involves specifying the data types, structures, and constraints of the data
to be stored in the database.
Constructing the database is the process of storing the data on some storage medium that
is controlled by the DBMS.
Manipulating a database includes functions such as querying the database to retrieve
specific data, updating the database to reflect changes in the mini-world, and generating
reports from the data.
Sharing a database allows multiple users and programs to access the database
simultaneously.
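The first three actions can be tried out in a few lines. The following is only a minimal sketch, assuming Python's built-in sqlite3 module as the DBMS; the table contact and its rows are hypothetical and are not part of these notes.

```python
import sqlite3

# Hypothetical table and data, chosen only for illustration.
conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Defining: specify the data types, structure, and constraints.
conn.execute(
    "CREATE TABLE contact ("
    " name TEXT PRIMARY KEY,"
    " phone TEXT NOT NULL,"
    " address TEXT)"
)

# Constructing: store the data on a medium controlled by the DBMS.
conn.executemany(
    "INSERT INTO contact (name, phone, address) VALUES (?, ?, ?)",
    [("Asha", "555-0101", "12 Lake Road"), ("Ravi", "555-0102", "7 Hill Street")],
)

# Manipulating: update to reflect a change in the mini-world, then query.
conn.execute("UPDATE contact SET phone = ? WHERE name = ?", ("555-0199", "Ravi"))
rows = conn.execute("SELECT name, phone FROM contact ORDER BY name").fetchall()
print(rows)
```

Sharing is not shown here, because an in-memory SQLite database is private to one connection; a server-based DBMS would let many programs run statements like these at the same time.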
1.2 DATABASE-SYSTEM APPLICATIONS (AN EXAMPLE)
Databases are widely used. Here are some representative applications:

• Enterprise Information
  - Sales: For customer, product, and purchase information.
  - Accounting: For payments, receipts, account balances, assets, and other
    accounting information.
  - Human resources: For information about employees, salaries, payroll taxes,
    and benefits, and for generation of paychecks.
  - Manufacturing: For management of the supply chain and for tracking
    production of items in factories, inventories of items in warehouses and
    stores, and orders for items.
  - Online retailers: For sales data noted above plus online order tracking,
    generation of recommendation lists, and maintenance of online product
    evaluations.
• Banking and Finance
  - Banking: For customer information, accounts, loans, and banking
    transactions.
  - Credit card transactions: For purchases on credit cards and generation of
    monthly statements.
  - Finance: For storing information about holdings, sales, and purchases of
    financial instruments such as stocks and bonds; also for storing real-time
    market data to enable online trading by customers and automated trading
    by the firm.
• Universities: For student information, course registrations, and grades (in
  addition to standard enterprise information such as human resources and
  accounting).
• Airlines: For reservations and schedule information. Airlines were among the
  first to use databases in a geographically distributed manner.
• Telecommunication: For keeping records of calls made, generating monthly
  bills, maintaining balances on prepaid calling cards, and storing information
  about the communication networks.
1.3 CHARACTERISTICS OF THE DATABASE APPROACH

Discuss the main characteristics of the database approach and how it differs from
the traditional file system.
A number of characteristics distinguish the database approach from the traditional
approach of programming with files:
1) In traditional file processing, each user defines and implements the files needed for
a specific application. Each user maintains separate files, which promotes
redundancy, wastes storage space, and leads to inconsistency.
The database approach, on the other hand, maintains a single repository of data
that can be accessed by various users. Hence it avoids redundancy and
inconsistency.
Fig: Simple file system
Fig: Contrasting database and file systems
2) Self-describing nature of the database system
The traditional file processing system does not contain a description of
itself. The database approach, however, stores not only the database but also
a complete description of the database structure and constraints in a
“catalog”. The information stored in the catalog is referred to as
“metadata”.
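The catalog can be inspected like any other data. The sketch below is an illustration only, assuming SQLite (through Python's sqlite3 module), where the catalog is the built-in table sqlite_master; the table student is hypothetical. Other DBMSs expose their catalogs under different names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table used only to have something in the catalog.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# sqlite_master is SQLite's catalog table: one row per schema object.
catalog = conn.execute("SELECT type, name FROM sqlite_master").fetchall()

# PRAGMA table_info returns metadata for each column of a table:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]

print(catalog)
print(columns)
```

The program never declared the structure of student itself; it asked the DBMS for the description, which is what "self-describing" means in practice.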

3) Insulation between Programs and Data, and Data Abstraction
In the traditional file processing approach, the data definition is part of the application
program. Hence programs can work with only one specific database.
In the database approach, however, the data definition is stored in the DBMS catalog
separately from the access programs. This property is called “program-data
independence”. Further, application programs can operate on the data by invoking
operations (functions) regardless of how these operations are implemented. This is
termed “program-operation independence”.
The characteristic of the database that allows program-data independence and
program-operation independence is called data abstraction.
4) Support of Multiple Views of the Data
A traditional file processing approach supports a single view of the data, whereas the
database approach supports multiple views. A database typically has
many users, each of whom may require a different view of the database. Hence
the DBMS approach provides facilities for defining multiple views.
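As an illustration of multiple views, the sketch below defines two views over one stored table, again assuming SQLite through Python's sqlite3 module. The table emp, the view names emp_directory and dept_payroll, and the rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (emp_no INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Sales", 52000.0), (2, "Ravi", "Sales", 48000.0), (3, "Meena", "HR", 50000.0)],
)

# View for a staff directory: a subset of the columns, salary is hidden.
conn.execute("CREATE VIEW emp_directory AS SELECT emp_no, name, dept FROM emp")

# View for managers: derived data that is computed, not stored.
conn.execute(
    "CREATE VIEW dept_payroll AS"
    " SELECT dept, COUNT(*) AS headcount, SUM(salary) AS total_salary"
    " FROM emp GROUP BY dept"
)

directory = conn.execute("SELECT * FROM emp_directory ORDER BY emp_no").fetchall()
payroll = conn.execute("SELECT * FROM dept_payroll ORDER BY dept").fetchall()
print(directory)
print(payroll)
```

Each user group queries its own view as if it were a table, while the data is stored only once.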
5) Sharing of data and multi-user transaction processing
Traditional file processing did not support sharing of data. The
modern database approach supports sharing of data as well as multi-user
transactions. For this, the DBMS includes concurrency control features to
ensure that several users trying to update the same data do so in a controlled
manner. It also enforces transaction properties such as isolation (each transaction
appears to execute in isolation from the others) and atomicity (either all operations
of a transaction are executed or none are).
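The atomicity property can be demonstrated with a small sketch. It assumes SQLite through Python's sqlite3 module; the account table and the transfer function are hypothetical. It runs on a single connection, so it illustrates atomicity only, not concurrency control between users.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " acct_no INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE acct_no = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE acct_no = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected an overdraft


ok = transfer(conn, 1, 2, 30)       # succeeds
failed = transfer(conn, 1, 2, 500)  # would overdraw account 1
balances = conn.execute("SELECT acct_no, balance FROM account ORDER BY acct_no").fetchall()
print(ok, failed, balances)
```

In the failed transfer the credit to account 2 had already been executed when the debit was rejected; the rollback undoes it, so no half-finished transfer is ever visible.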
1.4 ACTORS ON THE SCENE

The people whose jobs involve the day-to-day use of a large database are called the
actors on the scene.
1) Database Administrators (DBA)
What are the responsibilities of a DBA?
In any organization where many persons use the same resources, there is a need for
a chief administrator to oversee and manage these resources. In a database
environment, the primary resource is the database itself and the secondary resource
is the DBMS and related software. Administering these resources is the
responsibility of the database administrator (DBA). The DBA is responsible for
authorizing access to the database, for coordinating and monitoring its use, and for
acquiring software and hardware resources as needed.
2) Database Designers
What are the responsibilities of the database designer?
i. Database designers are responsible for identifying the data to be stored in
the database and for choosing appropriate structures to represent and store
this data.
ii. It is their responsibility to communicate with all prospective database users
in order to understand their requirements and to create a design that meets
these requirements.
iii. Database designers typically interact with each potential group of users and
develop “views” of the database as per the requirements of these groups.
iv. The final database design must be capable of supporting the
requirements of all user groups.
3) End Users
What are the different types of database end users? Discuss the main activities of
each.
End users are the people whose jobs require access to the database for querying,
updating, and generating reports; the database primarily exists for their use. There
are several categories of end users:
i. Casual end users access the database occasionally and may need different
information each time they query it. These users include managers and occasional
browsers.
ii. Naive or parametric end users make up the major portion of database
users. They constantly query and update the database using standard types of queries and
updates called “canned transactions”. For example, bank tellers check account
balances and post withdrawals and deposits.
iii. Sophisticated end users include engineers, scientists, business analysts,
and others who thoroughly familiarize themselves with the DBMS in order to
implement applications (programs) that meet their complex requirements.
iv. Stand-alone users maintain personal databases by using ready-made
program packages that provide easy-to-use menu-based or graphics-based
interfaces. An example is the user of a tax package that stores a variety of
personal financial data for tax purposes.
4) System Analysts and Application Programmers (Software Engineers)
System analysts determine the requirements of end users, especially naive and
parametric end users, and develop specifications for canned transactions that meet
these requirements. Application programmers implement these specifications as
programs; then they test, debug, document, and maintain these canned transactions.
Such analysts and programmers (nowadays called software engineers) should be
familiar with the full range of capabilities provided by the DBMS to accomplish their
tasks.
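A canned transaction is essentially a fixed, pre-tested operation in which the end user supplies only the parameters. The sketch below shows what an application programmer might write for the bank teller example; it assumes SQLite through Python's sqlite3 module, and the names open_bank, check_balance, and post_deposit, as well as the account data, are hypothetical.

```python
import sqlite3


def open_bank():
    """Create a small hypothetical bank database for the demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (acct_no INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
    conn.execute("INSERT INTO account VALUES (1001, 500)")
    conn.commit()
    return conn


def check_balance(conn, acct_no):
    """Canned query: the teller supplies only the account number."""
    row = conn.execute(
        "SELECT balance FROM account WHERE acct_no = ?", (acct_no,)
    ).fetchone()
    if row is None:
        raise KeyError(acct_no)
    return row[0]


def post_deposit(conn, acct_no, amount):
    """Canned update: the teller supplies the account number and the amount."""
    if amount <= 0:
        raise ValueError("deposit amount must be positive")
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            "UPDATE account SET balance = balance + ? WHERE acct_no = ?", (amount, acct_no)
        )
        if cur.rowcount != 1:
            raise KeyError(acct_no)
    return check_balance(conn, acct_no)


conn = open_bank()
new_balance = post_deposit(conn, 1001, 250)
print(new_balance)
```

The SQL text is fixed by the programmer and the values are passed as parameters, so a parametric user never has to write a query.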
1.5 WORKERS BEHIND THE SCENE

Who are the workers behind the scene in a DBMS software and system
environment?
These are the people who work to maintain the database system environment but who are not
actively interested in the database contents as part of their daily job.
i. DBMS system designers and implementers design and implement DBMS modules, such as
those for implementing the catalog, controlling concurrency, and handling data
recovery and security.
ii. Tool developers design and implement tools (software packages). Tools are optional
packages that are often purchased separately.
iii. Operators and maintenance personnel are responsible for the actual running
and maintenance of the hardware and software environment.
1.6 ADVANTAGES OF USING THE DBMS APPROACH
What are the advantages of using a DBMS approach? (or) Discuss the capabilities
that must be provided by a DBMS.
i. Controlling Redundancy: In traditional file processing the same data is often
stored multiple times, and this redundancy leads to several problems. First, there is
the need to perform a single logical update (such as entering data on a new student)
multiple times, which leads to duplication of effort. Second, storage space is wasted
when the same data is stored repeatedly, and this problem may be serious for large
databases. Third, files that represent the same data may become inconsistent. This may
happen because an update is applied to some of the files but not to others.
ii. Restricting Unauthorized Access: When multiple users share a large
database, it is likely that most users will not be authorized to access all
information in the database. For example, financial data is often considered
confidential, and only authorized persons are allowed to access it. In addition, some
users may only be permitted to retrieve data, whereas others are allowed to retrieve
and update. A DBMS should provide a security and authorization subsystem.
iii. Providing Persistent Storage for Program Objects: Databases can be used to
provide persistent storage for program objects and data structures. The values
of program variables or objects are discarded once a program terminates, unless
the programmer explicitly stores them in permanent files, which often involves
converting these complex structures into a format suitable for file storage.
The persistent storage of program objects and data structures is an important
function of database systems. Traditional database systems often suffered from the
so-called impedance mismatch problem, since the data structures provided by the DBMS
were incompatible with the data structures of the programming language.
iv. Providing Storage Structures and Search Techniques for Efficient Query
Processing: Database systems must provide capabilities for efficiently executing
queries and updates. Because the database is typically stored on disk, the DBMS
must provide specialized data structures and search techniques to speed up disk
search for the desired records. Auxiliary files called indexes are used for this
purpose.
v. Providing Backup and Recovery: A DBMS must provide facilities for recovering
from hardware or software failures. The backup and recovery subsystem of
the DBMS is responsible for recovery. For example, if the computer system fails in
the middle of a complex update transaction, the recovery subsystem is responsible for
making sure that the database is restored to the state it was in before the
transaction started executing.
vi. Providing Multiple User Interfaces: Because many types of users with varying
levels of technical knowledge use a database, a DBMS should provide a variety of
user interfaces. Both forms-style interfaces and menu-driven interfaces are
commonly known as graphical user interfaces (GUIs). Many specialized
languages and environments exist for specifying GUIs.
vii. Representing Complex Relationships among Data: A database may include
numerous varieties of data that are interrelated in many ways. A DBMS must
have the capability to represent a variety of complex relationships among the
data, to define new relationships as they arise, and to retrieve and update
related data easily and efficiently.
viii. Enforcing Integrity Constraints: Most database applications have certain
integrity constraints that must hold for the data. A DBMS should provide
capabilities for defining and enforcing these constraints. The simplest type of
integrity constraint involves specifying a data type for each data item.
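Constraint enforcement can be observed directly. The following sketch assumes SQLite through Python's sqlite3 module; the tables department and employee and the helper rejected are hypothetical. SQLite is lenient about column data types by default, so the sketch concentrates on NOT NULL, CHECK, key, and referential constraints rather than on data-type checking.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE department (dept_no INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)")
conn.execute(
    "CREATE TABLE employee ("
    " emp_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " salary INTEGER CHECK (salary > 0),"
    " dept_no INTEGER NOT NULL REFERENCES department(dept_no))"
)
conn.execute("INSERT INTO department VALUES (10, 'Research')")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 40000, 10)")


def rejected(sql):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


results = [
    rejected("INSERT INTO employee VALUES (2, NULL, 30000, 10)"),     # name must not be NULL
    rejected("INSERT INTO employee VALUES (3, 'Ravi', -5, 10)"),      # salary must be positive
    rejected("INSERT INTO employee VALUES (4, 'Meena', 30000, 99)"),  # department 99 does not exist
    rejected("INSERT INTO employee VALUES (1, 'Kiran', 30000, 10)"),  # emp_no 1 is already used
]
employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(results, employee_count)
```

The constraints are declared once, in the schema, and the DBMS enforces them for every program that updates the data.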
1.7 A BRIEF HISTORY OF DATABASE APPLICATIONS
• Early Database Applications:
The hierarchical and network models were introduced in the mid-1960s and
dominated during the 1970s.
A hierarchical database model is a data model in which the data are
organized into a tree-like structure. The data are stored as records which are
connected to one another through links.
Fig: Hierarchical database model
Disadvantages
- When a user needs to store a record in a child table that is not yet related to any
record in a parent table, the record cannot be stored directly; the user must first
record an additional entry in the parent table.
- This type of database cannot support complex relationships, and there is also a problem
of redundancy, which can result in inaccurate information due to the
inconsistent recording of data at various sites.
Network model (popular on mainframe computers)
Disadvantages
The disadvantages of the network model are as follows:
- The database contains a complex array of pointers.
- System complexity limits efficiency.
- Structural changes require changes in all application programs.
- Navigational access makes implementation and management complex.
- The complex structure puts heavy pressure on programmers.
- Any change, such as an update, deletion, or insertion, is very complex.
• Relational Model Based Systems:
- The relational model was originally introduced in 1970 and was heavily researched
and experimented with at IBM Research and several universities.
- Relational DBMS products emerged in the 1980s.
• Object-oriented and emerging applications:
- Object-Oriented Database Management Systems (OODBMSs) were
introduced in the late 1980s and early 1990s to cater to the need for complex data
processing in CAD and other applications. Their use has not taken off much.
- Many relational DBMSs have incorporated object database concepts, leading to a
new category called object-relational DBMSs (ORDBMSs).
- Extended relational systems add further capabilities (e.g., for multimedia data, XML,
and other data types).
• Data on the Web and E-commerce Applications:
- The Web contains data in HTML (Hypertext Markup Language) with links among pages.
- This has given rise to a new set of applications, and e-commerce is using new
standards like XML (eXtensible Markup Language).
- Scripting languages such as PHP and JavaScript allow generation of
dynamic Web pages that are partially generated from a database. They also allow
database updates through Web pages.
• New functionality is being added to DBMSs in the following areas:
- Scientific applications
- XML (eXtensible Markup Language)
- Image storage and management
- Audio and video data management
- Data warehousing and data mining
- Spatial data management
- Time series and historical data management
The above gives rise to new research and development in incorporating new data
types, complex data structures, new operations, and storage and indexing schemes
in database systems.
1.8 DATA MODELS, SCHEMAS, AND INSTANCES
Data model: A data model is a relatively simple representation, usually graphical, of
more complex real-world data structures.
(or)
A data model represents data structures and their characteristics,
relations, constraints, transformations, and other constructs with the purpose of
supporting a specific problem domain.
Categories of Data Models

Discuss the main categories of data models.
Many data models have been proposed, which we can categorize according to the
types of concepts they use to describe the database structure.
High-level or conceptual data models provide concepts that are close to the way
many users perceive data.
Low-level or physical data models provide concepts that describe the details of
how data is stored on computer storage media.
Between these two extremes is a class of representational (or implementation) data
models, which provide concepts that may be easily understood by end users but that
are not too far removed from the way data is organized in computer storage.
Database Schema and Database State

What is the difference between a database schema and a database state?
The description of a database is called the “database schema”. It is specified during
the database design phase and is not expected to change frequently. A pictorial
representation of the database schema is called the schema diagram.
Schema Diagram: An illustrative display of (most aspects of) a database schema.
Schema Construct: A component of the schema or an object within the schema, e.g.,
STUDENT, COURSE.
Database State: The actual data present in the database at a particular moment in time is
called the database state (or snapshot, or the current set of occurrences or instances).
The database state may change frequently, every time the database is updated.
Initial Database State: Refers to the database state when the database is first
populated with data.
Valid State: A state that satisfies the structure and constraints of the database.
The database schema is sometimes called the “intension”, and a database state is called
an “extension” of the schema.
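The distinction can be checked experimentally: updates change the state but leave the schema alone. The sketch below assumes SQLite through Python's sqlite3 module; the STUDENT table contents and the helper functions schema and state are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL)")


def schema(conn):
    """The stored description (intension) of the student table."""
    return conn.execute("SELECT sql FROM sqlite_master WHERE name = 'student'").fetchone()[0]


def state(conn):
    """The data currently in the table (extension) at this moment."""
    return conn.execute("SELECT roll_no, name FROM student ORDER BY roll_no").fetchall()


schema_before = schema(conn)
empty_state = state(conn)  # state right after definition: no data yet

conn.execute("INSERT INTO student VALUES (1, 'Asha')")
conn.execute("INSERT INTO student VALUES (2, 'Ravi')")

later_state = state(conn)  # a new state after the updates
schema_after = schema(conn)
print(empty_state, later_state, schema_before == schema_after)
```

Every insert, update, or delete produces a new database state, whereas the schema changes only when the design itself is altered.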
1.9 THREE-SCHEMA ARCHITECTURE AND DATA INDEPENDENCE

The goal of the three-schema architecture, illustrated in the figure, is to separate the user
applications from the physical database.
CREATE TABLE EMP
(
    Emp_No     INT(15) PRIMARY KEY,
    First_Name VARCHAR(20),
    Last_Name  VARCHAR(20),
    Dept_num   VARCHAR(10)
);
1. The external or view level includes a number of external schemas or user views.
Each external schema describes the part of the database that a particular user group
is interested in and hides the rest of the database from that user group.

2. The conceptual level has a conceptual schema, which describes the structure of
the whole database for a community of users. The conceptual schema hides the
details of physical storage structures and concentrates on describing entities, data
types, relationships, user operations, and constraints.
3. The internal level has an internal schema, which describes the physical storage
structure of the database. The internal schema uses a physical data model and
describes the complete details of data storage and access paths for the database.
1.10 DATA INDEPENDENCE
What is the difference between logical data independence and physical data
independence? Which one is harder to achieve? Why?
The three-schema architecture can be used to achieve two types of data independence:
1. Logical data independence
2. Physical data independence
1. Logical data independence is the capacity to change the conceptual schema without
having to change external schemas or application programs. We may change the conceptual
schema to expand the database (by adding a record type or data item), to change
constraints, or to reduce the database (by removing a record type or data item).
2. Physical data independence is the capacity to change the internal schema without
having to change the conceptual schema. Hence, the external schemas need not be changed
either. Changes to the internal schema may be needed because some physical files were
reorganized (for example, by creating additional access structures) to improve the
performance of retrieval or update. If the same data as before remains in the database, we
should not have to change the conceptual schema.
Logical data independence is harder to achieve than physical data independence, because
it requires that structural and constraint changes leave application programs
unaffected, which is a much stricter requirement.
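Both kinds of independence can be illustrated on a small scale. The sketch below assumes SQLite through Python's sqlite3 module. The table mirrors the EMP example; the view emp_names, the index idx_emp_dept, the added column phone, and the two query functions (standing in for application programs) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (emp_no INTEGER PRIMARY KEY,"
    " first_name TEXT, last_name TEXT, dept_num TEXT)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Rao", "D1"), (2, "Ravi", "Kumar", "D2"), (3, "Meena", "Iyer", "D1")],
)
# External schema: a user view defined over the conceptual schema.
conn.execute("CREATE VIEW emp_names AS SELECT emp_no, first_name, last_name FROM emp")


def view_query(conn):
    """Stand-in for an application that works through the external view."""
    return conn.execute("SELECT first_name, last_name FROM emp_names ORDER BY emp_no").fetchall()


def dept_query(conn):
    """Stand-in for an application that queries the conceptual schema."""
    return conn.execute(
        "SELECT emp_no FROM emp WHERE dept_num = ? ORDER BY emp_no", ("D1",)
    ).fetchall()


before = (view_query(conn), dept_query(conn))

# Physical change: add an access structure. No query text has to change.
conn.execute("CREATE INDEX idx_emp_dept ON emp (dept_num)")
after_index = (view_query(conn), dept_query(conn))

# Logical change: expand the conceptual schema with a new data item.
conn.execute("ALTER TABLE emp ADD COLUMN phone TEXT")
after_column = (view_query(conn), dept_query(conn))

print(before == after_index == after_column)
```

Adding a column is the easy case of a logical change. Removing or restructuring a data item that programs refer to would break them unless the mappings are redefined, which is why logical data independence is the harder of the two.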
1.11 DATABASE LANGUAGES AND INTERFACES

Write a note on the different DBMS languages.
The DBMS must provide appropriate languages and interfaces for each category of users.
DBMS Languages
• Data Definition Language (DDL)
• Storage Definition Language (SDL)
• View Definition Language (VDL)
• Data Manipulation Language (DML)

Data Definition Language (DDL): Used by the DBA and database designers to specify the
conceptual schema of a database. The DBMS will have a DDL compiler whose function is to
process DDL statements in order to identify descriptions of the schema constructs and to
store the schema description in the DBMS catalog.
Storage Definition Language (SDL): Used to specify the internal schema. The mappings
between the conceptual and internal schemas may be specified in either one of these
languages.
View Definition Language (VDL): Used to specify user views and their mappings to the
conceptual schema, but in most DBMSs the DDL is used to define both conceptual and
external schemas.
Data Manipulation Language (DML): Used to specify database retrievals and
updates. DML commands (the data sublanguage) can be embedded in a general-purpose
programming language (the host language), such as COBOL, C, C++, or Java.
DBMS Interfaces:
Discuss the different types of user-friendly interfaces and the types of users who
typically use each.
The DBMS provides many user-friendly interfaces that enable users to interact with
the data in the database, such as:
Menu-Based Interfaces: These interfaces present the user with lists of options (called
menus) that help the user to make a request. The advantage is that the user need
not memorize specific commands and syntax.
Forms-Based Interfaces: A forms-based interface displays a form to each user. Users can
fill out the form entries to insert new data into the database. Forms are usually
designed and programmed for naive users.
Graphical User Interfaces: A GUI typically presents the schema to the user in pictorial
form. The user can then use a pointing device (such as a mouse) to choose among the
options provided by the GUI.
Natural Language Interfaces: These interfaces accept requests written in English or some
other language and attempt to understand them. A natural language interface usually has a
dictionary of important words. If the interpretation is successful, the interface
generates a high-level query. Otherwise, a dialogue is started with the user to clarify
the request.
Speech Input and Output: Limited use of speech as an input query and speech as an
answer to a question or result of a request is becoming commonplace. Applications with
limited vocabularies such as inquiries for telephone directory, flight arrival/departure, and
credit card account information are allowing speech for input and output to enable
customers to access this information.

Interfaces for Parametric Users: Parametric users, such as bank tellers, often have a
small set of operations that they must perform repeatedly.
Interfaces for the DBA: Most database systems contain privileged commands that can be
used only by the DBA staff. These include commands for creating accounts, setting system
parameters, granting account authorization, changing a schema, and reorganizing the
storage structures of a database.
1.12 THE DATABASE SYSTEM ENVIRONMENT/TYPICAL COMPONENTS OF DBMS MODULE AND INTERACTIONS

What other computer system software does a DBMS interact with? With a neat
diagram, explain the component modules of a DBMS and their interaction.
DBMS Component Modules:
Fig: Component modules of a DBMS and their interactions

A DBMS is a complex software system. This section describes the types of software
components that constitute a DBMS and the types of computer system software with which
the DBMS interacts.
The figure is divided into two parts. The top part of the figure refers to the various users of
the database environment and their interfaces. The lower part shows the internals of the
DBMS responsible for storage of data and processing of transactions. The database and the
DBMS catalog are usually stored on disk. Access to the disk is controlled primarily by the
operating system (OS).
Many DBMSs have their own buffer management module to schedule disk
reads and writes, because this has a considerable effect on performance. The top part of
the figure shows interfaces for the DBA staff, casual users who work with interactive
interfaces to formulate queries, application programmers who create programs using some
host programming languages, and parametric users who do data entry work by supplying
parameters to predefined transactions.
The DBA staff works on defining the database and tuning it by making changes to its
definition using the DDL and other privileged commands. Queries are parsed and
validated for correctness of the query syntax, the names of files and data elements, and so
on by a query compiler that compiles them into an internal form. The query optimizer is
concerned with the rearrangement and possible reordering of operations, elimination of
redundancies, and use of correct algorithms and indexes during execution. The
precompiler extracts DML commands from an application program written in a host
programming language. Concurrency control and the backup and recovery
systems are shown as a separate module in the figure.
The DBMS interacts with the operating system when disk accesses—to the database or to
the catalog—are needed. If the computer system is shared by many users, the OS will
schedule DBMS disk access requests and DBMS processing along with other processes. On
the other hand, if the computer system is mainly dedicated to running the database server,
the DBMS will control main memory buffering of disk pages.
Database System Utilities:

What are database utilities? List a few common functions that the utilities perform.
Database utilities refer to additional facilities that help the DBA to manage the database
system. Some of the common utilities are:
i. Loading: A loading utility is used to load existing data files (such as text files or
sequential files) into the database. Usually, the current (source) format of the data
file and the desired (target) database file structure are specified to the utility, which
then automatically reformats the data and stores it in the database.
ii. Backup: A backup utility creates a backup copy of the database, usually by dumping
the entire database onto tape or another mass storage medium. The backup copy can
be used to restore the database in case of catastrophic disk failure.
iii. Database storage reorganization: This utility can be used to reorganize a set of
database files into different file organizations and create new access paths to
improve performance.
iv. Performance monitoring: Such a utility monitors database usage and provides
statistics to the DBA. The DBA uses the statistics in making decisions such as whether
or not to reorganize files or whether to add or drop indexes to improve performance.
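The loading and backup utilities can be imitated in miniature. The sketch below is only an illustration, assuming SQLite through Python's sqlite3 module together with the standard csv module; the source text, the student table, and the variable names are hypothetical, and a production utility would read real files and write the backup to separate storage.

```python
import csv
import io
import sqlite3

# Hypothetical source file contents; a real loader would read a file from disk.
source_text = "roll_no,name\n1,Asha\n2,Ravi\n3,Meena\n"

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Loading: convert records from the source format to the target table structure.
reader = csv.DictReader(io.StringIO(source_text))
records = [(int(row["roll_no"]), row["name"]) for row in reader]
with db:  # commit all loaded rows together
    db.executemany("INSERT INTO student VALUES (?, ?)", records)

# Backup: copy the entire database into a second database.
backup_db = sqlite3.connect(":memory:")
db.backup(backup_db)

# The backup copy holds the same data and could be used to restore it.
restored = backup_db.execute("SELECT roll_no, name FROM student ORDER BY roll_no").fetchall()
print(restored)
```

Here the "reformatting" step is just the conversion of the text field roll_no to an integer; real loading utilities handle far more varied source formats.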
Tools, Application Environments, and Communications Facilities

Tools: Other tools are often available to database designers, users, and the DBMS. CASE
tools are used in the design phase of database systems. Another tool that can be quite
useful in large organizations is an expanded data dictionary (or data repository) system.
Application development environments, such as PowerBuilder (Sybase), JBuilder (Borland),
or XAMPP, have been quite popular. These systems provide an environment for developing
database applications and include facilities that help in many facets of database
systems, including database design, GUI development, querying and updating, and
application program development.
Communications software: The DBMS also needs to interface with communications
software, whose function is to allow users at locations remote from the database system
site to access the database through computer terminals, workstations, or personal
computers. These are connected to the database site through data communications
hardware such as Internet routers, phone lines, long-haul networks, local networks, or
satellite communication devices. Many commercial database systems have communication
packages that work with the DBMS.
1.13 CENTRALIZED AND CLIENT/SERVER ARCHITECTURES FOR DBMSs

Write a note on the centralized and the client/server architecture for DBMS.
i. Centralized DBMSs Architecture: Older architectures used mainframe
computers to provide the main processing for all system functions, including user
application programs and user interface programs, as well as all the DBMS
functionality. The reason was that in older systems, most users accessed the DBMS
via computer terminals that did not have processing power and only provided
display capabilities. Therefore, all processing was performed remotely on the
computer system housing the DBMS, and only display information and controls
were sent from the computer to the display terminals, which were connected to the
central computer via various types of communications networks.
Fig: A physical centralized architecture
As prices of hardware declined, most users replaced their terminals with PCs and
workstations, and more recently with mobile devices.
Gradually, DBMS systems started to exploit the available processing power at the
user side, which led to client/server DBMS architectures.
ii. Basic Client/Server Architectures: The concept of client/server architecture
assumes an underlying framework that consists of many PCs/workstations and
mobile devices as well as a smaller number of server machines, connected via
wireless networks or LANs and other types of computer networks. A client in this
framework is typically a user machine that provides user interface capabilities and
local processing. When a client requires access to additional functionality (such as
database access) that does not exist at the client, it connects to a server that
provides the needed functionality. A server is a system containing both hardware
and software that can provide services to the client machines, such as file access,
printing, archiving, or database access.

Two main types of basic DBMS architectures were created on this underlying
client/server framework: two-tier and three-tier.
Two-Tier Client/Server Architectures for DBMSs:
Explain the operation of a 2-tier client/server architecture. How does it differ from
a 3-tier client/server architecture?
In a 2-tier client/server architecture, the user-interface programs and application
programs run on the client side. When the client requires database access, it
establishes a connection with the DBMS, which runs on the server side. Once the
connection is created, the client program can communicate with the DBMS.
A standard called OPEN DATABASE CONNECTIVITY (ODBC) provides an
application programming interface (API) that allows client-side programs to call the
DBMS, as long as both the client and server machines have the necessary software
installed. The advantages of this architecture are its simplicity and its compatibility
with existing systems. However, the emergence of the Web changed the roles of
clients and servers, leading to the 3-tier architecture.
Physical two-tier client/server architecture

Three-Tier and n-Tier Architectures for Web Applications:
Many web applications use a 3-tier architecture, which adds an intermediate layer called
the APPLICATION SERVER or WEB SERVER between the client and the database
server. This additional layer stores the business rules (i.e., procedures or constraints)
that are used to access data from the database server.
It can also improve database security by checking a client's credentials before
forwarding a request to the database server.
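The three tiers described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the database server, and the names application_server and AUTHORIZED_USERS, as well as the sample row, are hypothetical and not part of any real product.

```python
import sqlite3

# Database tier: an in-memory SQLite database stands in for the database server.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE employee (ssn TEXT PRIMARY KEY, name TEXT, salary REAL)")
db.execute("INSERT INTO employee VALUES ('123456789', 'John', 30000)")

# Hypothetical credential store held by the application server.
AUTHORIZED_USERS = {"alice": "secret"}


def application_server(user, password, ssn):
    """Middle tier: check credentials, then forward the query to the database tier."""
    if AUTHORIZED_USERS.get(user) != password:
        raise PermissionError("invalid credentials")
    return db.execute(
        "SELECT name, salary FROM employee WHERE ssn = ?", (ssn,)
    ).fetchone()


# Client tier: calls the application server and never touches the database directly.
result = application_server("alice", "secret", "123456789")
```

A request with wrong credentials is rejected in the middle tier, so it never reaches the database tier.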
Logical three-tier client/server architecture, with a couple of commonly used nomenclatures
1.14 CLASSIFICATION OF DATABASE MANAGEMENT SYSTEMS

What are the different ways of classifying a DBMS? Explain.
There are several criteria based on which DBMSs can be classified:
i. Based on the Data Model: The main data model used in many current commercial
DBMSs is the relational data model, and the systems based on this model are
known as SQL systems. The object data model has been implemented in some
commercial systems but has not had widespread use. Recently, so-called big data
systems, also known as key-value storage systems and NOSQL systems, have come
into use; they rely on various data models such as document-based, graph-based,
column-based, and key-value data models.
Many legacy applications still run on database systems based on the hierarchical
and network data models.
Some experimental DBMSs are based on the XML (eXtensible Markup Language)
model, which is a tree-structured data model. These have been called native XML
DBMSs.
ii. Based on the number of users: DBMSs can be classified by the number of users
supported by the system. Single-user systems support only one user at a time and are
mostly used with PCs. Multiuser systems, which include the majority of DBMSs, support
multiple concurrent users.
iii. Based on the number of sites: DBMSs can be classified as centralized or distributed.
A centralized DBMS can support multiple users, but the DBMS and the database reside
totally at a single computer site. A distributed DBMS (DDBMS) can have the actual
database and DBMS software distributed over many sites connected by a computer
network. Big data systems are often massively distributed, with hundreds of sites.
The data is often replicated on multiple sites so that the failure of a site will not make
some data unavailable.
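iv. Based on the types of software at the various sites: distributed DBMSs can be
classified as homogeneous or heterogeneous. Homogeneous DDBMSs use the same
DBMS software at all the sites, whereas heterogeneous DDBMSs can use different
DBMS software at each site.
v. Based on the purpose: DBMSs can be classified as special purpose or general
purpose.
Besides the above criteria, DBMSs can also be classified based on cost, on the types of
access path used for storing files, etc.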
1.15 USING HIGH-LEVEL CONCEPTUAL DATA MODELS FOR DATABASE DESIGN
OR MAIN PHASES OF DATABASE DESIGN
What are the different phases of a database design?
or
Write a note on using high level conceptual data models for database design.
The main phases involved in the design of a database are shown below.

A simplified diagram to illustrate the main phases of database design
The first step is called the requirements collection and analysis phase. During this
phase, the database designers interview the database users to understand their
expectations. At the end of the first phase, the database designers produce the data
requirements and functional requirements of the application.
In parallel with specifying the data requirements, it is useful to specify the known
functional requirements of the application. These consist of the user-defined operations
(or transactions) that will be applied to the database, including both retrievals and
updates.
In software design, it is common to use data flow diagrams, sequence diagrams,
scenarios, and other techniques to specify functional requirements.
Once the first step has been completed, the second step involves the functional analysis of
the functional requirements and the preparation of the CONCEPTUAL DESIGN using the
data requirements. This phase includes the creation of ENTITY TYPES, RELATIONSHIPS
and CONSTRAINTS.
The third step in the database design is the actual implementation of the database using a
COMMERCIAL DBMS such as ORACLE. Most current commercial DBMSs use an
implementation data model such as the relational (SQL) model, so in this phase the
conceptual schema is transformed from the high-level data model into the implementation
data model. This step is called logical design or data model mapping.
The last step is the physical design phase, during which the internal storage structures,
file organizations, indexes, access paths, and physical design parameters for the database
files are specified. In parallel with these activities, application programs are designed and
implemented as database transactions corresponding to the high level transaction
specifications.
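As a rough illustration of the last two phases, the sketch below maps two entity types and one relationship into relational tables (logical design) and then adds an index (one possible physical design choice). It assumes SQLite through Python's standard sqlite3 module as a stand-in for a commercial DBMS; the table names, column names, and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
-- Logical design: each entity type becomes a table, each key a PRIMARY KEY.
CREATE TABLE department (
    number INTEGER PRIMARY KEY,
    name   TEXT NOT NULL UNIQUE
);
-- The relationship between employee and department becomes a foreign key.
CREATE TABLE employee (
    ssn         TEXT PRIMARY KEY,
    name        TEXT NOT NULL,
    dept_number INTEGER NOT NULL REFERENCES department(number)
);
-- Physical design: an index is one example of an access path.
CREATE INDEX idx_employee_dept ON employee(dept_number);
""")

conn.execute("INSERT INTO department VALUES (5, 'Research')")
conn.execute("INSERT INTO employee VALUES ('123456789', 'John', 5)")

rows = conn.execute(
    "SELECT e.name, d.name FROM employee e "
    "JOIN department d ON e.dept_number = d.number"
).fetchall()
```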
1.16 ENTITIES, ENTITY TYPES, ENTITY SETS, ATTRIBUTES, AND KEYS
ENTITIES and ATTRIBUTES: These are the basic objects of an ER model. An entity represents
a "thing" in the real world that has an independent existence.
Each entity has attributes. Attributes are properties that describe an entity more fully.
Eg: the EMPLOYEE entity would be described by the name, age, address, salary, sex,
etc., which become the attributes of the entity.
Types of Attributes:
What are the different types of attributes? Explain.
1. Composite Attributes
2. Simple (Atomic) Attributes
3. Single-Valued Attributes
4. Multivalued Attributes
5. Derived Attributes
6. Stored Attributes
7. Complex Attributes
1. Composite Attributes: Attributes that can be divided into smaller sub-parts, where
the sub-parts represent more basic attributes.
Ex: the Address attribute can be further divided into Street_no, City, State, Zipcode,
etc.
2. Simple (Atomic) Attributes: Attributes that cannot be further subdivided are called
simple or atomic attributes.
Ex: the Sex attribute cannot be further subdivided and hence is an atomic attribute.
3. Single-Valued Attributes: Most attributes have a single value for a particular entity;
such attributes are called single-valued.
Ex: Age is a single-valued attribute of a person.
4. Multivalued Attributes: Some attributes can have a set of values for the same entity;
such attributes are called multivalued.
Ex: Color: {red, blue}, Phone_no
5. Derived Attributes: In some cases, the value of one attribute can be obtained from the
value of another attribute.
Ex: the Age attribute can be derived by subtracting the date of birth (DOB) from the
current date.
6. Stored Attributes: An attribute whose value cannot be obtained from another
attribute, and is instead entered directly for each entity, is called a stored attribute.
Ex: Date of birth is a stored attribute.
7. Complex Attributes: In general, composite and multivalued attributes can
be nested arbitrarily. We can represent arbitrary nesting by grouping the components of a
composite attribute between parentheses ( ) and separating the components with commas,
and by displaying multivalued attributes between braces { }. Such attributes are called
complex attributes.
Eg: A complex attribute, Address_phone:
{Address_phone({Phone(Area_code,Phone_number)},Address(Street_address(Number,Street,Apartment_number),City,State,Zip))}
Both Phone and Address are themselves composite attributes.
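The attribute types above can be sketched with Python dataclasses. This is only an illustration, not a prescribed mapping: the class names, field names, and sample values are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Address:
    # Composite attribute: made of simpler component attributes.
    street: str
    city: str
    state: str
    zipcode: str


@dataclass
class Employee:
    ssn: str                 # simple, single-valued attribute
    birth_date: date         # stored attribute
    address: Address         # composite attribute
    phone_numbers: List[str] = field(default_factory=list)  # multivalued attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from the stored birth_date."""
        had_birthday = (today.month, today.day) >= (
            self.birth_date.month, self.birth_date.day)
        return today.year - self.birth_date.year - (0 if had_birthday else 1)


emp = Employee(
    "123456789",
    date(1990, 6, 15),
    Address("12 Main St", "Bengaluru", "KA", "560037"),
    ["080-1111", "080-2222"],
)
```

Because age is computed on demand, it never goes out of date, which is the usual reason for not storing a derived attribute.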
What is a NULL value?
In some cases, a particular entity may not have an applicable value for an attribute.
For example, the College_degree attribute applies only to employees who are
educated up to the college level. For such situations, a special value called NULL is
used.
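A small sketch of how NULL behaves in practice, assuming SQLite through Python's sqlite3 module; the table, column names, and rows are hypothetical. Python's None is stored as NULL, and NULL has to be tested with IS NULL rather than with the = operator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT NOT NULL, college_degree TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Asha", "B.E."), ("Ravi", None)],  # None is stored as NULL
)

# IS NULL finds the employees for whom the attribute has no applicable value.
no_degree = conn.execute(
    "SELECT name FROM employee WHERE college_degree IS NULL"
).fetchall()

# A comparison with '=' never matches NULL, so this query returns no rows.
equals_null = conn.execute(
    "SELECT name FROM employee WHERE college_degree = NULL"
).fetchall()
```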

ENTITY TYPES:
Define the terms entity types and entity set.
ENTITY TYPE: An entity type defines a collection of entities that have the same attributes.
Key Attributes of an Entity Type: An important constraint on the entities of an entity type
is the key or uniqueness constraint on attributes. An entity type usually has one or more
attributes whose values are distinct for each individual entity in the entity set. Such an
attribute is called a key attribute, and its values can be used to identify each entity
uniquely. Each key attribute has its name underlined inside the oval in an ER diagram.
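The uniqueness requirement can be checked on a sample entity set with a short Python sketch. The function name is_key and the sample data are hypothetical. Note that a key is a constraint on every possible entity set, so passing this check on one sample does not prove that an attribute is a key; failing it does show that it is not.

```python
def is_key(entities, attribute):
    """Return True if no two entities share a value for the given attribute."""
    values = [entity[attribute] for entity in entities]
    return len(values) == len(set(values))


employees = [
    {"ssn": "111", "name": "John", "salary": 30000},
    {"ssn": "222", "name": "John", "salary": 40000},
    {"ssn": "333", "name": "Asha", "salary": 30000},
]

ssn_is_unique = is_key(employees, "ssn")    # every ssn value is distinct
name_is_unique = is_key(employees, "name")  # two employees are named John
```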
ENTITY SET: An entity set is a set of entities of the same type that share the same
properties or attributes.
OR
The collection of all entities of a particular entity type is called an entity set.
Ex: (Figure: an ENTITY TYPE and its ENTITY SET)
VALUE SETS (Domains) of Attributes: Each simple attribute of an entity type is
associated with a value set (or domain of values), which specifies the set of values that
may be assigned to that attribute for each individual entity.
Ex: if the range of ages allowed for employees is between 16 and 70, we can specify the
value set of the Age attribute of EMPLOYEE to be the set of integer numbers between 16
and 70.
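The Age example can be written as a small Python check. The names AGE_VALUE_SET and is_valid_age are hypothetical; the bounds 16 and 70 come from the example above and are treated as inclusive.

```python
# Value set (domain) of the Age attribute: the integers 16 through 70 inclusive.
AGE_VALUE_SET = range(16, 71)


def is_valid_age(age):
    """Return True only if age is an integer inside the value set."""
    return isinstance(age, int) and age in AGE_VALUE_SET
```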
INITIAL CONCEPTUAL DESIGN OF THE COMPANY DATABASE
1. Identifying all entity sets
2. Identifying the attributes of each entity set (being aware of the different attribute types)
3. Identifying feasible relationship types
4. Identifying cardinality ratios
5. Identifying participation constraints
6. Identifying participation roles (if any)
Entity types for the COMPANY database.
We can identify four entity types, one corresponding to each of the four items in the
specification:
1. An entity type DEPARTMENT with attributes Name, Number, Locations,
Manager, and Manager_start_date. Locations is the only multivalued
attribute.
2. An entity type PROJECT with attributes Name, Number, Location, and
Controlling_department. Both Name and Number are (separate) key
attributes.
3. An entity type EMPLOYEE with attributes Name, Ssn, Sex, Address,
Salary, Birth_date, Department, and Supervisor. Both Name and Address may
be composite attributes; the designers must check with the users whether any of
them will refer to the individual components of Name (First_name, Middle_initial,
Last_name) or of Address.
4. An entity type DEPENDENT with attributes Employee, Dependent_name,
Sex, Birth_date, and Relationship (to the employee).
1.17 RELATIONSHIP TYPES, RELATIONSHIP SETS, ROLES, AND STRUCTURAL CONSTRAINTS
What is meant by Relationship Type and Relationship Sets?
RELATIONSHIPS: A relationship relates two or more distinct entities with a specific
meaning; that is, it is an association among entities.
Entities do not exist in isolation.
Ex: EMPLOYEE John works on the Pro-X PROJECT.

What is meant by Degree of a Relationship type?
The degree of a relationship type is the number of participating entity types.
UNARY/RECURSIVE RELATIONSHIP: an association maintained within a single entity type
BINARY RELATIONSHIP: an association maintained between two entity types
TERNARY RELATIONSHIP: an association maintained among three entity types
Note: although higher degrees exist, they are not specifically named.
RELATIONSHIP TYPES: A relationship type R among n entity types E1, E2, E3, ..., En
defines a set of associations among entities from these entity types.
RELATIONSHIP SETS: R is a set of relationship instances ri, where each ri associates n
individual entities (e1, e2, ..., en), and each entity ej in ri is a member of entity set Ej.
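A relationship set can be pictured as a set of tuples, one tuple per relationship instance, and the degree as the length of each tuple. The sketch below assumes this simple representation; the entity identifiers and the ternary SUPPLY example are hypothetical.

```python
employees = {"e1", "e2", "e3"}
departments = {"d1", "d2"}

# Binary relationship set: each instance associates one EMPLOYEE with one DEPARTMENT.
works_for = {("e1", "d1"), ("e2", "d1"), ("e3", "d2")}

# Hypothetical ternary relationship set: (supplier, part, project).
supply = {("s1", "p1", "j1"), ("s2", "p1", "j2")}


def degree(relationship_set):
    """Degree = number of participating entity types = length of each instance."""
    lengths = {len(instance) for instance in relationship_set}
    assert len(lengths) == 1, "all instances must have the same degree"
    return lengths.pop()


# Every instance must draw its entities from the participating entity sets.
well_formed = all(e in employees and d in departments for e, d in works_for)
```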
Example:
(Figure) Some instances in the WORKS-FOR relationship set, which represents a
relationship type between EMPLOYEE and DEPARTMENT.
ROLES
What is a participation role? When is it necessary to use role names in the description of
relationship types?
The role name signifies the role that a participating entity plays in each relationship
instance. For example, consider the EMPLOYEE and DEPARTMENT entity types connected
by the WORKS-FOR relationship type. Here EMPLOYEE plays the role of employee (or
worker) and DEPARTMENT plays the role of department (or employer), which signifies that
the employee works for the particular department. Role names are not compulsory when
the participating entity types are distinct. However, when the same entity type
participates more than once in a relationship type, role names become essential for
distinguishing the meaning of each participation. Such relationships are called
"RECURSIVE RELATIONSHIPS".
(Figure: EMPLOYEE participating twice in SUPERVISION)
In the above example, the EMPLOYEE entity type participates twice in SUPERVISION: once
in the role of supervisor and once in the role of supervisee. In such recursive
relationships the role names are essential.
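The SUPERVISION example can be sketched in Python by naming the two roles explicitly. This is an illustrative representation only; the employee identifiers and helper function names are hypothetical.

```python
from typing import NamedTuple


class Supervision(NamedTuple):
    # Both fields refer to EMPLOYEE entities; the field names are the role names.
    supervisor: str
    supervisee: str


supervision = {
    Supervision(supervisor="e1", supervisee="e2"),
    Supervision(supervisor="e1", supervisee="e3"),
    Supervision(supervisor="e2", supervisee="e4"),
}


def supervisees_of(emp):
    """Employees for whom emp plays the supervisor role."""
    return sorted(s.supervisee for s in supervision if s.supervisor == emp)


def supervisor_of(emp):
    """The employee who supervises emp, or None if emp has no supervisor."""
    for s in supervision:
        if s.supervisee == emp:
            return s.supervisor
    return None
```

Without the role names, the pair ("e1", "e2") would not say who supervises whom, which is why role names are essential here.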
CONSTRAINTS
There are two types of relationship constraints:
1. Cardinality Ratio
2. Participation Constraint
CARDINALITY RATIO:
The cardinality ratio of a binary relationship specifies the maximum number of
relationship instances that an entity can participate in. The possible cardinality ratios
for binary relationship types are:
1. One to One (1:1)
2. One to Many (1:N)
3. Many to One (N:1)
4. Many to Many (M:N)
• One to one (1:1): An example of a 1:1 binary relationship is MANAGES, which
relates a DEPARTMENT entity to the EMPLOYEE who manages that department. This
represents the constraint that, at any point in time, an employee can manage one
department only and a department can have one manager only.
Ex: employee manages department
• Many to one (N:1): An example of an N:1 binary relationship is WORKS-FOR, which
relates an EMPLOYEE entity to a DEPARTMENT entity. This represents the constraint that,
at any point in time, a DEPARTMENT may have many employees but an EMPLOYEE works
for only one department. Read from the DEPARTMENT side, the same relationship is one
to many (1:N).
Ex: employee works for department
• Many to many (M:N): An example of an M:N binary relationship is WORKS-ON, which
relates the EMPLOYEE entity to the PROJECT entity. This represents the constraint that, at
any point in time, an employee may work on more than one PROJECT and a PROJECT can
have more than one EMPLOYEE.
Ex: many employees work on many projects
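The following Python sketch classifies a set of binary relationship instances by the ratio they exhibit. It is only an illustration: the function name and sample data are hypothetical, and the result describes the observed instances, whereas a cardinality ratio is a constraint on the schema that all valid states must obey.

```python
from collections import Counter


def cardinality_ratio(instances):
    """Classify (left, right) instances as '1:1', '1:N', 'N:1' or 'M:N'."""
    left_counts = Counter(left for left, _ in instances)
    right_counts = Counter(right for _, right in instances)
    # A left entity that appears more than once is related to many right entities.
    left_has_many = max(left_counts.values(), default=0) > 1
    # A right entity that appears more than once is related to many left entities.
    right_has_many = max(right_counts.values(), default=0) > 1
    if left_has_many and right_has_many:
        return "M:N"
    if left_has_many:
        return "1:N"
    if right_has_many:
        return "N:1"
    return "1:1"


manages = {("e1", "d1"), ("e2", "d2")}                  # (employee, department)
works_for = {("e1", "d1"), ("e2", "d1"), ("e3", "d2")}  # (employee, department)
works_on = {("e1", "p1"), ("e1", "p2"), ("e2", "p1")}   # (employee, project)
```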
PARTICIPATING CONSTRAINTS
What is meant by participation constraints? Explain.
A participation constraint specifies whether the existence of an entity depends on its being
related to another entity via a relationship type. There are two types of participation
constraints:
1. Total participation (existence dependency)
2. Partial participation
TOTAL PARTICIPATION: If the company policy states that every employee must work for
a department, then an employee entity can exist only if it participates in at least one
WORKS-FOR relationship instance. Thus, the participation of EMPLOYEE in WORKS-FOR is
called TOTAL PARTICIPATION (also called existence dependency). Total participation is
represented by double lines in an ER diagram.
Ex: PROFESSOR teaches CLASS; EMPLOYEE WORKS-FOR DEPARTMENT
PARTIAL PARTICIPATION: We do not expect every EMPLOYEE to manage a department,
and hence the participation of EMPLOYEE in the MANAGES relationship is partial. Partial
participation is represented by a single line in an ER diagram.
Ex: EMPLOYEE MANAGES DEPARTMENT (partial participation)
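Whether a given database state satisfies total participation can be checked with a short Python sketch. The function name participation and the sample data are hypothetical; the check compares the entity set with the entities that actually appear in the relationship instances.

```python
def participation(entity_set, instances, position):
    """Return 'total' if every entity appears in some instance, else 'partial'.

    position is the index of the entity type inside each instance tuple.
    """
    participating = {instance[position] for instance in instances}
    return "total" if entity_set <= participating else "partial"


employees = {"e1", "e2", "e3"}
works_for = {("e1", "d1"), ("e2", "d1"), ("e3", "d2")}  # every employee appears
manages = {("e1", "d1"), ("e3", "d2")}                  # e2 manages nothing

employee_in_works_for = participation(employees, works_for, 0)
employee_in_manages = participation(employees, manages, 0)
```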

NOTE:
Write a note on the structural constraints of relationship types.
If this question is asked in the exams, then discuss both i) cardinality ratio and ii)
participation constraints.
1.18 WEAK ENTITY

When is the concept of a weak entity used in data modeling? Define the terms owner
entity type, weak entity type, identifying relationship type and partial key.
A weak entity is one that meets two conditions:
1. The entity is existence-dependent; that is, it cannot exist without the entity with
which it has a relationship.
2. The entity has a primary key that is partially or totally derived from the parent
entity in the relationship.
Entity types that do not have key attributes of their own are called "weak entity types". On
the other hand, strong entity types are those which have a key of their own.
A weak entity is identified by being related to a specific entity of another, strong entity
type, in combination with one of its own attribute values. Such a strong entity type is
called the "identifying or owner entity type". The relationship type that relates a weak
entity type to its owner is called the "identifying relationship".
A weak entity type normally has a "partial key", which is an attribute (or set of attributes)
that can uniquely identify the weak entities that are related to the same owner entity.
In ER diagrams, both a weak entity type and its identifying relationship are distinguished
by surrounding their boxes and diamonds with double lines. Further, the partial key
attribute is underlined with a dashed line.
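Consider the following example:
(Figure: ER diagram. Strong entity type EMPLOYEE with attributes ssn, e_name, e_sex, e_ph,
and the composite attribute e_dob (date, month, year); identifying relationship "has";
weak entity type DEPENDENT with attributes name, sex, relation, and the composite
attribute dob (date, month, year).)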
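The EMPLOYEE and DEPENDENT example can be sketched in Python by identifying each dependent through the pair (owner key, partial key). The sketch assumes that the dependent's name is the partial key, which the notes do not state explicitly; the function names and sample data are hypothetical.

```python
# Owner (strong) entities, keyed by their own key attribute ssn.
employees = {"111": "John", "222": "Asha"}

# Weak entities, keyed by (owner ssn, partial key); the value is the relation.
dependents = {}


def add_dependent(owner_ssn, name, relation):
    """A dependent can exist only if its owner employee exists."""
    if owner_ssn not in employees:
        raise KeyError("owner employee does not exist")
    key = (owner_ssn, name)  # the partial key is unique only per owner
    if key in dependents:
        raise ValueError("duplicate dependent for this employee")
    dependents[key] = relation


def remove_employee(ssn):
    """Removing an owner also removes its existence-dependent weak entities."""
    del employees[ssn]
    for key in [k for k in dependents if k[0] == ssn]:
        del dependents[key]


add_dependent("111", "Alice", "daughter")
add_dependent("222", "Alice", "spouse")  # same name, different owner: allowed
```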

1.19 SUMMARY OF NOTATIONS FOR ER-DIAGRAMS

List the summary of notations for ER diagrams. Also discuss the naming conventions used
for ER schema diagrams.

The naming conventions used in the ER schema diagram are:

i. One should choose names that convey, as much as possible, the meanings attached to
them in the ER schema.
ii. Normally singular names are chosen for entity types rather than plural ones.
iii. Normally entity type and relationship type names are in uppercase letters, whereas
attribute names have their initial letter capitalized. Role names are in lowercase.
iv. As a general practice, given a narrative description of the database requirements,
the nouns appearing in the narrative tend to give rise to entity types, whereas verbs
tend to indicate relationship types.
v. Another naming consideration involves choosing binary relationship names to make
the ER diagram of the schema readable from left to right and from top to bottom.

1.20 GENERALIZATION AND SPECIALIZATION
1. Generalization: Entities are grouped together to represent a more generalized
view. In other words, we suppress the differences among several entity types, identify
their common features, and generalize them into a single superclass.
(Figure: superclass VEHICLE with attributes id, price, license_plate_no; subclass CAR with
attributes Max_speed, No_of_passenger; subclass TRUCK with attributes No_of_axles,
Tonnage.)

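2. Specialization: This is the opposite of generalization. In specialization, a group of
entities is divided into subgroups based on their characteristics.
(Figure: PERSON with attributes Name, Age, gender; an IS A link to the subclasses
student, with attribute roll no, and Teacher, with attribute emp_id.)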
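Class inheritance in Python gives a rough picture of the superclass and subclass idea, using the vehicle example from the generalization figure. The class and field names follow that figure loosely and the sample values are hypothetical. Reading the hierarchy bottom-up corresponds to generalization, and reading it top-down corresponds to specialization.

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    # Superclass: the attributes common to all vehicles.
    vehicle_id: str
    price: float
    license_plate_no: str


@dataclass
class Car(Vehicle):
    # Subclass: inherits the Vehicle attributes and adds its own.
    max_speed: int
    no_of_passengers: int


@dataclass
class Truck(Vehicle):
    no_of_axles: int
    tonnage: float


car = Car("V1", 500000.0, "KA01AB1234", 180, 5)
truck = Truck("V2", 1500000.0, "KA01CD5678", 3, 12.5)
```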
1.21 ER DIAGRAMS
MOVIE DATABASE

COMPANY DATABASE/EMPLOYEE DATABASE

AIR LINE DATABASE

BANKING DATABASE

MUSIC DATABASE

LIBRARY MANAGEMENT DATABASE

HOSPITAL MANAGEMENT DATABASE

1.22 Example of Other Notation: UML Class Diagrams
