DW Concepts
DW Concepts
DW Concepts
Hanu
Agenda
OLTP Vs OLAP
Modeling Techniques
User Profile
Top down approach
Bottom up approach
OLTP (Contd)
Transactional database require a highly
normalized database design to achieve
performance goals and to optimize on storage
space
These databases need to record, on a real-time
basis, every transaction that the organization
enters into
What is OLAP ?
An organizations success also depends on its
ability to analyze data (through views and reports)
and make intelligent decisions that potentially
affect its future. Systems that facilitate such
analyses are called
On Line Analytical
Processing (OLAP) systems
In other words...
OLTP systems are
Fragmented
Not integrated.
Difficult to access.
Disparate sources.
Disparate platforms.
Poor data quality.
Redundant data.
Difficult to understand.
Data warehouse
A Data Warehouse is a copy of the enterprise
operational data, suitably modified to support the
needs of analytical processes and stored outside
the operational database.
According to Bill Inmon, known as the father of
Data Warehousing, a data warehouse is a subject
oriented, integrated, time-variant, nonvolatile
collection of data in support of management
decisions.
OLAP Vs OLTP
Data warehouse database
Designed for analysis of
business measures by categories
and attributes
Optimized for bulk loads and
large, complex, unpredictable
queries that access many rows
per table
Loaded with consistent, valid
data; requires no real time
validation
Supports few concurrent users
relative to OLTP
OLTP database
Designed for real-time business
operations
Optimized for a common set of
transactions, usually adding or
retrieving a single row at a time
per table
Optimized for validation of
incoming data during
transactions; uses validation
data tables
Supports thousands of
concurrent users
OLAP Servers
(Tier 2)
Clients
(Tier 3)
e.g., MOLAP
Semistructured
Sources
Data
Warehouse
extract
transform
load
refresh
etc.
Analysis
serve
Query/Reporting
serve
e.g., ROLAP
Operational
DBs
serve
Data Marts
Data Mining
Characteristic of D/W
User Profile
Statisticians (2%)
Knowledge workers (15%)
Information Consumers (83%)
Customer Behavior
Corporate Customer
Customer Service
Accounts
Settlements
Partner
Supplier
Competitor
Marketing
Targets
Star Schema
Star Schema
Snowflake Schema
Snowflake Schema
Star or Snowflake
Level of hierarchies
Surrogate Key
Date and Time
Fact tables
ETL
Extract, Transform and Load process may be
described as the process of selecting, migrating,
transforming, cleansing and converting mapped
data from the legacy environment to data
warehouse environment.
Extraction
Push strategy
Pull strategy
Transformation
Transformation involves applying complex filters,
removing the inconsistency between data from
different sources, conditional transforms, complex
calculations to create derived data etc. Cleansing
of data could be an important part of the
transformation process
Loading
Loading involves the insertion of data into the
target system, that is, the data warehouse. Loading
is the last step before the users see the data. It
involves populating the fact and dimension tables
as well as aggregation tables that are part of the
physical data model
Loading approach
Transform and Load
Load and Transform
Transform while Loading
Issues in Loading
Volume and frequency of loading
Disk space
Scheduling
Data Marts
A data mart is a repository of data gathered from
operational data and other sources that is designed
to serve a particular community of knowledge
workers. In scope, the data may derive from an
enterprise-wide database or data warehouse or be
more specialized. The emphasis of a data mart is on
meeting the specific demands of a particular group
of knowledge users in terms of analysis, content,
presentation, and ease-of-use
OLAP
ROLAP
MOLAP
HOLAP
Cognos
Business Objects
Power Analyzer
Microsoft Analysis service
Micro strategy
DB2 OLAP Server
Hyperion OLAP Server
References
http://192.168.121.14/asp/Search/DispDoc.asp?Do
cNo=8703&KCURating=8.61&ContentType=Inte
rnal+Literature
http://www.datawarehouse-training.com
http://www.datawarehousing.com
http://www.caworld.com/proceedings/2000/dat
a_warehousing/ws006pn/sld001.htm
http://sdgcomputing.com
http://www.dmreview.com
Thank You
Hanu