Big Data
Component
Functionality
Operational Reporting/Analytics
Canned and static reports geared toward reporting needs that change very
infrequently, and the use of analytic dashboards which employ visual alerts
to present status updates and areas of concern for key performance
indicators (KPIs).
Vendors: Actuate, Information Builders, Business Objects
Ad Hoc Querying/Reporting
Managed queries which are drawn from an environment with a defined
set of query options that can be executed.
Online Analytical Processing (OLAP)
An approach to quickly provide answers to analytical queries that are
multidimensional in nature. At the heart of OLAP is the cube, an
arrangement of data in arrays that allows fast analysis.
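As a rough illustration of the cube idea, the sketch below keys a handful of sales figures by three dimensions and sums them along whichever dimensions are not of interest. The dimension names, the figures, and the `roll_up` helper are all assumptions made up for this example; real OLAP engines precompute and index such aggregates rather than scanning a dictionary.

```python
from collections import defaultdict

# Hypothetical fact data: (region, product, quarter) -> sales amount.
DIMENSIONS = ("region", "product", "quarter")
FACTS = {
    ("East", "Widget", "Q1"): 100,
    ("East", "Widget", "Q2"): 120,
    ("East", "Gadget", "Q1"): 80,
    ("West", "Widget", "Q1"): 90,
    ("West", "Gadget", "Q2"): 60,
}


def roll_up(facts, keep):
    """Sum the measure over every dimension not named in `keep`."""
    idx = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(int)
    for key, amount in facts.items():
        totals[tuple(key[i] for i in idx)] += amount
    return dict(totals)


by_region = roll_up(FACTS, ["region"])                      # one dimension
by_region_quarter = roll_up(FACTS, ["region", "quarter"])   # two dimensions
grand_total = roll_up(FACTS, [])                            # all dimensions collapsed
```

Here `by_region` holds one total per region, and `grand_total` holds a single entry under the empty key, which is the fully collapsed cube.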
Monitor critical business metrics, raise alerts on issues that need attention, and
manage indicators in order to take action faster. Allows for tracking of
performance through scorecards and collaboration with others to follow
recommended actions to improve organizational performance.
Data Mining/Forecasting/Statistical Analysis
Vendors: SAS, IBM
Data mining is the process of discovering meaningful new correlations, patterns, and trends
by sifting through large amounts of data stored in repositories, using pattern recognition
technologies as well as statistical and mathematical techniques. Data mining can perform
the following tasks:
Task
Explanation
Description
Describe patterns and trends lying within data. High-quality description can often be accomplished by
exploratory data analysis, a graphical method of exploring data in search of patterns and trends.
Classification
In classification, there is a target categorical variable, such as income bracket, which, for example, could be
partitioned into three classes or categories: high income, middle income, and low income. The data mining
model examines a large set of records, each record containing information on the target variable as well as
a set of input or predictor variables.
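A minimal sketch of classification, assuming two made-up predictor variables (age and years of education) and a nearest-neighbor majority vote as the model. The source does not name a particular algorithm; k-nearest neighbors is swapped in here only because it is short, and the training records are invented for illustration.

```python
from collections import Counter

# Hypothetical training records: (age, years_of_education) -> income bracket.
TRAINING = [
    ((25, 12), "low"),
    ((30, 12), "low"),
    ((35, 16), "middle"),
    ((40, 16), "middle"),
    ((45, 20), "high"),
    ((50, 20), "high"),
]


def classify(record, training=TRAINING, k=3):
    """Predict the bracket by majority vote among the k nearest records."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    nearest = sorted(training, key=lambda row: sq_dist(row[0], record))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


bracket = classify((48, 20))  # target variable is categorical
```

The target variable is the bracket label, and the predictors are the numeric fields of each record, mirroring the split between target and input variables described above.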
Estimation
Estimation is similar to classification except that the target variable is numerical rather than categorical.
An example is estimating the amount of money a randomly chosen family of four will spend on back-to-school
shopping this fall.
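One way to sketch estimation is a least-squares line fitted to past observations and then evaluated for a new case. The income and spending figures below are invented, and using household income as the single predictor is an assumption for this example, not something the source specifies.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: household income (thousands) vs. back-to-school spend.
incomes = [40, 60, 80, 100]
spending = [300, 400, 500, 600]

intercept, slope = fit_line(incomes, spending)
estimate = intercept + slope * 70  # numerical target for a new household
```

Unlike classification, the output here is a number rather than a category, which is the distinction the estimation task draws.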
Prediction
Prediction is similar to classification and estimation, except that for prediction the results lie in the future.
Clustering
Clustering refers to the grouping of records, observations, or cases into classes of similar objects. A cluster
is a collection of records that are similar to one another, and dissimilar to records in other clusters.
Clustering differs from classification in that there is no target variable for clustering.
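The grouping idea can be sketched with a one-dimensional version of k-means (Lloyd's algorithm), chosen here only for brevity. The purchase amounts and the starting centers are assumptions for this example. Note that no target variable appears anywhere in the code.

```python
def k_means_1d(values, centers, iterations=10):
    """Lloyd's algorithm in one dimension with caller-supplied starting centers."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            # Assign each value to the index of its closest center.
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to its cluster mean; an empty cluster keeps its center.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical purchase amounts with two visibly separate groups.
purchases = [12, 15, 14, 80, 85, 90]
centers, clusters = k_means_1d(purchases, centers=[0, 100])
```

Records end up grouped purely by similarity to one another, so the small purchases land in one cluster and the large ones in the other.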
Association
The association task for data mining is the job of finding which attributes go together. Most prevalent in
the business world, where it is known as affinity analysis or market basket analysis, the task of association
seeks to uncover rules for quantifying the relationship between two or more attributes. An example is examining
the proportion of children whose parents read to them who are themselves good readers.
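Two standard measures for quantifying such rules are support and confidence. The source does not name them, so treat the sketch below as one common formulation. The baskets are invented, and the helper names are hypothetical. The confidence of a rule is a proportion of the same kind as the reading example above: among cases with the antecedent, the share that also show the consequent.

```python
def support(transactions, items):
    """Fraction of transactions containing every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Share of antecedent transactions that also contain the consequent.

    Raises ZeroDivisionError if the antecedent never occurs.
    """
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


# Hypothetical market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]

rule_support = support(baskets, {"diapers", "beer"})
rule_confidence = confidence(baskets, {"diapers"}, {"beer"})
```

In this toy data, diapers and beer appear together in three of five baskets, and three of the four baskets containing diapers also contain beer.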
Duplicate transactions
Data quality
Transaction limits
File matching
Character pattern matching
Segregation of duties (SoD)
Aging
Numeric pattern matching
Date/time matching
Variance tests
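Three of the tests listed above (duplicate transactions, transaction limits, and aging) can be sketched in a few lines. The ledger rows, the field layout, the limit value, and the 30-day bucket boundaries are all assumptions for illustration; actual test definitions depend on the organization's policies.

```python
from collections import Counter
from datetime import date

# Hypothetical ledger rows: (invoice_id, vendor, amount, invoice_date).
TRANSACTIONS = [
    ("INV-1", "Acme", 500.00, date(2024, 1, 10)),
    ("INV-2", "Acme", 500.00, date(2024, 1, 10)),
    ("INV-3", "Bolt", 12000.00, date(2024, 3, 1)),
    ("INV-4", "Core", 75.50, date(2023, 11, 20)),
]


def duplicate_transactions(rows):
    """Return IDs of rows that share vendor, amount, and date with another row."""
    counts = Counter((vendor, amount, day) for _, vendor, amount, day in rows)
    return [r[0] for r in rows if counts[(r[1], r[2], r[3])] > 1]


def over_limit(rows, limit):
    """Return IDs of rows whose amount exceeds the approval limit."""
    return [r[0] for r in rows if r[2] > limit]


def aging_bucket(invoice_date, as_of):
    """Assign an invoice to a 30-day aging bucket by days outstanding."""
    days = (as_of - invoice_date).days
    if days <= 30:
        return "0-30"
    if days <= 60:
        return "31-60"
    if days <= 90:
        return "61-90"
    return "over 90"


duplicates = duplicate_transactions(TRANSACTIONS)
flagged = over_limit(TRANSACTIONS, limit=10000)
buckets = {r[0]: aging_bucket(r[3], date(2024, 3, 31)) for r in TRANSACTIONS}
```

Each function returns identifiers for follow-up rather than a verdict, since a flagged row is a candidate for review, not proof of an error.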
Ingredients
Technology: Cost of storage and computing power has decreased exponentially
Data: Wildly increasing loads of data
Cultural Shifts as organizations learn to
What is predicted?
The kind of behavior (i.e., action, event, or
PA Application: Targeting Direct Marketing
What is Predicted: Which customers will respond to marketing contact.
Predictive Score
Predictive Model
A mechanism that predicts a behavior of individual,
If the individual is still in high school AND
13.5%