
AIS Elect - Reviewer



DATA ANALYTICS
Data analytics is the application of data science approaches to gain insights from data.

DATA ECOSYSTEM AND LIFE-CYCLE
Data Ecosystem - refers to the programming languages, packages, algorithms, cloud computing services, and general infrastructure an organization uses to collect, store, analyze, and leverage data.
Data Life-Cycle - describes the path data takes from when it’s first generated to when it’s interpreted into actionable insights. This life cycle can be split into eight steps: generation, collection, processing, storage, management, analysis, visualization, and interpretation.

4 TYPES OF ANALYTICS
1. Descriptive Analytics – looks at data to examine, understand, and describe something that’s already happened.
2. Diagnostic Analytics – goes deeper than descriptive analytics by seeking to understand the “why” behind what happened.
3. Predictive Analytics – relies on historical data, past trends, and assumptions to answer questions about what will happen in the future.
4. Prescriptive Analytics – identifies specific actions an individual or organization should take to reach future targets or goals.

DATA ANALYTICS MODEL
I - Identify the questions
M - Master the data
P - Perform test plan
A - Address and refine results
C - Communicate insights
T - Track outcomes

DATA ACQUISITION AND PREPARATION
OBJECTIVES
• Understand data types and sources before initiating the process of acquisition, preparation and analysis.
• Understand how data are organized in an accounting information system.
• Understand how data are stored in a Relational Database Management System.
• Explain and apply extraction, transformation, and loading (ETL) techniques.

1. UNSTRUCTURED DATA is qualitative data in the form of text files, audio files, and video files.
• Documents, Photos, Audio Files, Text Streams, Emails, Video
2. STRUCTURED DATA is quantitative data in the form of numbers and values.
• Numerical or Quantitative Data
  • Discrete Data, Continuous Data
• Categorical or Qualitative Data
  • Nominal Data, Ordinal Data
• Nominal Data:
  • UNIVARIATE – observation on a single variable
  • BIVARIATE – observation on two variables
  • MULTIVARIATE – observation on more than one variable
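
The structured data categories described above (discrete, continuous, nominal, ordinal) can be illustrated with a small sketch. This is only a simplified heuristic, not a standard algorithm: the function `classify_column`, the sample columns, and the `ordered_levels` argument are hypothetical names introduced for illustration.

```python
def classify_column(values, ordered_levels=None):
    """Classify a column as discrete, continuous, ordinal, or nominal.

    Simplified heuristic for illustration only:
    - whole numbers            -> discrete
    - other numeric values     -> continuous
    - text with a known order  -> ordinal
    - any other text           -> nominal
    """
    def is_number(v):
        # bool is a subclass of int in Python, so exclude it explicitly
        return isinstance(v, (int, float)) and not isinstance(v, bool)

    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return "discrete"
    if all(is_number(v) for v in values):
        return "continuous"
    if ordered_levels is not None and all(v in ordered_levels for v in values):
        return "ordinal"
    return "nominal"


# Hypothetical sample columns
columns = {
    "units_sold": [3, 10, 7],                       # counts
    "unit_price": [19.99, 5.5, 12.0],               # measurements
    "satisfaction": ["low", "high", "medium"],      # ranked categories
    "region": ["north", "south", "north"],          # unranked categories
}
# Only ranked categories come with an explicit ordering
levels = {"satisfaction": ["low", "medium", "high"]}

result = {
    name: classify_column(vals, levels.get(name))
    for name, vals in columns.items()
}
print(result)
```

In practice the distinction between nominal and ordinal data cannot be inferred from the values alone, which is why the ordering is passed in explicitly here.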
DATA SOURCES
1. Internal Data – collected from the organization itself. It relates to the activities or transactions performed within the organization, e.g. sales, financial, employee, stocks, etc.
2. External Data – collected from outside the organization and relates to the environment in which the organization operates, e.g. PEST factors and competitors.

DATA ACQUISITION
Data acquisition involves obtaining access to and collecting data.

Note: Involve the audit team from the design stage of the IT systems (e.g., Data Acquisition System); this would facilitate acquisition of data in the requisite format. To ensure this, field offices would need to convey the data requirements for audit to the concerned entities at the stage of important system developments, thereby facilitating access to requisite data when the system is operational.

COLLECTION OF DATA
DATA COLLECTION is the systematic approach of gathering and measuring information from a variety of sources to get a complete and accurate picture of an area of interest.

Note: The IT system should be studied and understood while collecting data, which would facilitate identification and requisition of relevant data. These can be complete databases, selected tables out of the databases, selected data fields of tables in the databases, or data pertaining to specific criteria/conditions for a particular period, location, class, etc. This may be obtained in FLAT FILE or DUMP FILE FORMATS.

Relational Database Management System (RDBMS)
A database management system is software consisting of several programs. This management system is used for database creation, control, maintenance, etc.

There are three aspects of data in the relational database model, namely:
1. Data Structure
2. Data Integrity
3. Data Manipulation

DATA STRUCTURE is a specialized format for organizing, processing, retrieving and storing data.
DATA INTEGRITY – the degree to which the data is unimpaired.
DATA QUALITY – the degree to which the data is “fit for purpose” for the correct operation of the application.
DATA SECURITY – the degree to which the details of the data are protected.

Threats to a dataset’s integrity:
• Human error
• Inconsistencies across format
• Collection error
• Cybersecurity or internal privacy breaches

DATA MANIPULATION is the process of changing or altering data in order to make it more readable and organized. It also alters relationships between data items.

Benefits of RDBMS
1. Completeness.
2. No redundancy.
3. Business rules are enforced.
4. Communication and integration of business processes.
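
The RDBMS benefits listed above (completeness, no redundancy, enforced business rules) can be sketched with database constraints. This is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database; the `supplier` table, its columns, and the sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Constraints express the rules the database should enforce:
#   NOT NULL -> completeness (no missing values)
#   UNIQUE   -> no redundancy (no duplicate supplier names)
#   CHECK    -> a business rule (credit limit cannot be negative)
conn.execute("""
    CREATE TABLE supplier (
        supplier_id  INTEGER PRIMARY KEY,
        name         TEXT NOT NULL UNIQUE,
        credit_limit REAL NOT NULL CHECK (credit_limit >= 0)
    )
""")
conn.execute("INSERT INTO supplier VALUES (1, 'Acme Supply', 5000.0)")

# Hypothetical rows that each break one rule
bad_rows = [
    (1, "Duplicate Key Co", 100.0),  # reuses primary key 1
    (2, "Acme Supply", 100.0),       # duplicate supplier name
    (3, "Negative Ltd", -50.0),      # negative credit limit
    (4, None, 10.0),                 # missing name
]

rejected = []
for row in bad_rows:
    try:
        conn.execute("INSERT INTO supplier VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError as exc:
        # The database refuses the row and reports which rule was violated
        rejected.append((row[0], str(exc)))

row_count = conn.execute("SELECT COUNT(*) FROM supplier").fetchone()[0]

for supplier_id, reason in rejected:
    print(supplier_id, "->", reason)
print("rows stored:", row_count)
```

Because the rules live in the table definition, every program that writes to the table is subject to them, rather than each program re-implementing its own checks.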
Characteristics of RDBMS
1. Data entry can be done easily by creating tables.
2. Data entry can be controlled with the help of data validation.
3. Arithmetic operations can be performed on numeric data.
4. Application software/programs can be developed easily.
5. Based on the data, the required charts or graphs can be created, and interesting reports can be created by adding images.
6. Users can easily exchange information from one database to another database.
7. Users can find the required information very easily.
8. Reports and labels can be easily created and printed in various formats.
9. The graphical facilities (Object Linking and Embedding) of Windows can be used to integrate graphics into reports and create graphical data entry forms.
10. Data can be easily imported from other programs and relationships can be established with files from other programs.

DATA PREPARATION
THE IMPACT CYCLE
• Identify the question.
• Master the data.
• Provide the meaning.
• Actionable recommendations.
• Communicate insights.
• Track outcomes.

ETL Process Steps
1. (EXTRACT) Determine the purpose and scope of the data request
2. (EXTRACT) Obtain the data
3. (TRANSFORM) Validate the data for completeness and integrity
4. (TRANSFORM) Sanitize the data
5. (LOAD) Load the data in preparation for data analysis

ETL OF DATA
• (STEP 1 & 2) EXTRACTION
• (STEP 3 & 4) TRANSFORMATION
• (STEP 5) LOAD

Data extraction is the process of collecting or retrieving disparate types of data.
Data transformation is the process of converting, cleansing, and structuring data.
Data loading is defined as copying data from one electronic file or database into another.

Extraction, Transformation, and Loading of Data
Objectives:
• Understand how data are stored in a relational database.
• Explain and apply extraction, transformation, and loading (ETL) techniques.

Note: Storing data in a relational database ensures that data are complete, not redundant, and that business rules are enforced; it also aids communication and integration across business processes.

Different Attributes in RDBMS Tables
1. Every column must be both unique and relevant to the purpose of the table.
2. Each table must have a primary key to ensure that each row is unique.
3. Each table has a foreign key to create the relationship between and among tables.
4. Each table contains descriptive attributes that provide actual business information.

[Figure: Purchase Order Table]
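
The table attributes described above (primary key, foreign key, descriptive attributes) can be sketched with two small related tables. This is a minimal sketch using Python's built-in `sqlite3` module; the table layouts, column names, and sample rows are hypothetical and are not taken from the Purchase Order or Supplier figures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE supplier (
        supplier_id   INTEGER PRIMARY KEY,   -- primary key: one row per supplier
        supplier_name TEXT NOT NULL          -- descriptive attribute
    );
    CREATE TABLE purchase_order (
        po_number   INTEGER PRIMARY KEY,     -- primary key: one row per order
        supplier_id INTEGER NOT NULL
                    REFERENCES supplier(supplier_id),  -- foreign key
        order_date  TEXT NOT NULL,           -- descriptive attribute
        amount      REAL NOT NULL            -- descriptive attribute
    );
""")

conn.execute("INSERT INTO supplier VALUES (10, 'Acme Supply')")
conn.executemany(
    "INSERT INTO purchase_order VALUES (?, ?, ?, ?)",
    [
        (1001, 10, "2024-01-15", 250.0),
        (1002, 10, "2024-02-03", 125.5),
    ],
)

# The foreign key lets the two tables be joined back together
rows = conn.execute("""
    SELECT po.po_number, s.supplier_name, po.amount
    FROM purchase_order AS po
    JOIN supplier AS s ON s.supplier_id = po.supplier_id
    ORDER BY po.po_number
""").fetchall()

# An order that points at a supplier that does not exist is refused
try:
    conn.execute(
        "INSERT INTO purchase_order VALUES (1003, 99, '2024-03-01', 80.0)"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(rows)
print("orphan order rejected:", orphan_rejected)
```

The supplier name is stored once in `supplier` and referenced by key from each order, which is how the relational design avoids redundant copies of the same information.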
[Figure: Supplier Table]

EXTRACTION OF DATA
Step 1: Determine the purpose and scope of the data request
• What is the purpose of the data request?
• What risks exist in data integrity?
• What other information will impact the nature, timing, and extent of the data analysis?
Step 2: Obtain the data
• How will data be requested and/or obtained?
• From whom do you request the data?
• Where are the data located?
• What specific data are needed?
• What tools will be used to perform tests and why?

[Figure: Example Standard Data Request Form]

TRANSFORMATION OF DATA
Step 3: Validating the data for completeness and integrity
Step 4: Cleaning the data

Steps in validating the extracted data
• Compare the number of records.
• Compare descriptive statistics for numeric fields.
• Validate date/time fields.
• Compare string limits for text fields.

Common ways of cleaning the extracted and validated data
• Remove headings or subtotals.
• Clean leading zeroes and nonprintable characters.
• Format negative numbers.
• Correct inconsistencies across data, in general.

LOADING OF DATA
Step 5: Loading the data for data analysis
The last and simplest step in mastering the data.

ANOTHER TOPIC
MODELING AND EVALUATION
1. Descriptive Analytics – “What happened?” HINDSIGHT
2. Diagnostic Analytics – “Why did this happen?” INSIGHT
3. Predictive Analytics – “What might happen in the future?” INSIGHT
4. Prescriptive Analytics – “What should we do next?” FORESIGHT

Descriptive Analytics
• Examination of data or content to answer the question “What happened?” or to alert on “What is going to happen”, using traditional business intelligence (BI) and visualizations such as pie charts, bar charts, line graphs, tables, or generated narratives.
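
Descriptive analytics usually starts from simple summary statistics. The sketch below uses only Python's standard library to compute the kind of figures that feed a “What happened?” table or chart. The `sales` records and their field names are made-up sample data, not figures from this reviewer. The same statistics can also be compared between the source system and the extract when validating numeric fields.

```python
from statistics import mean, median

# Hypothetical monthly sales records
sales = [
    {"month": "Jan", "amount": 120.0},
    {"month": "Feb", "amount": 95.5},
    {"month": "Mar", "amount": 143.25},
    {"month": "Apr", "amount": 101.25},
]

amounts = [row["amount"] for row in sales]

# Summary statistics that describe what already happened
summary = {
    "records": len(amounts),
    "total": sum(amounts),
    "mean": mean(amounts),
    "median": median(amounts),
    "min": min(amounts),
    "max": max(amounts),
}

# The record with the highest amount
best_month = max(sales, key=lambda row: row["amount"])["month"]

for name, value in summary.items():
    print(f"{name:>8}: {value}")
print("best month:", best_month)
```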
Diagnostic Analytics
Diagnostic analytics is a form of advanced analytics that examines data
or content to answer the question “Why did it
happen?” The goal of diagnostic analytics is to
help you locate the root cause of a problem.

Types of Diagnostic Analysis
• COMPARATIVE ANALYSIS (Cost Benefit Analysis)
• VALUE BASED ANALYSIS
• CORRELATION ANALYSIS

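Correlation analysis, the last diagnostic technique listed above, measures how strongly two variables move together. The sketch below computes the Pearson correlation coefficient by hand with the standard library; the function name `pearson` and the advertising and sales figures are hypothetical sample data. A coefficient near 1 or -1 indicates a strong linear association, which is a lead for finding a root cause rather than proof of one.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / (spread_x * spread_y)


# Hypothetical data: advertising spend and sales for five periods
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 41, 62, 79, 103]

r = pearson(ad_spend, sales)
print(f"correlation between ad spend and sales: {r:.3f}")
```
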
Predictive Analytics
Predictive analytics is the use of data, statistical algorithms and machine learning techniques to identify the likelihood of future outcomes based on historical data. The goal is to go beyond knowing what has happened to provide a best assessment of “What will happen in the future”.

Types of Predictive Analysis
• Time Series Analysis (Impact of Decisions on Outcomes)
• Regression Analysis (Quantifying causal relationships among variables)

Prescriptive Analytics
Finding the best course of action in a scenario with the available data. It’s related to both descriptive and predictive analytics but emphasizes actionable insights instead of data monitoring. “How can we make it happen?” “What shall we do next?”

Modern Tools for Prescriptive Analytics
• Machine Learning Algorithm
• Natural Language Processing

Key takeaways:
Various approaches to data analytics include looking at what happened (descriptive analytics), why something happened (diagnostic analytics), what is going to happen (predictive analytics), or what should be done next (prescriptive analytics). Data analytics relies on a variety of software tools, ranging from spreadsheets, data visualization and reporting tools, and data mining programs to open-source languages for the greatest data manipulation.
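
As a small illustration of the predictive approach summarized above, the sketch below fits a straight trend line by ordinary least squares and uses it to project the next period. It uses only the standard library; the function name `fit_line` and the monthly revenue figures are hypothetical, and a real forecast would need more data and a check of how well the line fits.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be the same")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical historical data: revenue for months 1 to 6
months = [1, 2, 3, 4, 5, 6]
revenue = [102.0, 108.0, 121.0, 129.0, 142.0, 148.0]

slope, intercept = fit_line(months, revenue)

# Project the fitted trend one month ahead
forecast_month_7 = intercept + slope * 7

print(f"trend: revenue = {intercept:.2f} + {slope:.2f} * month")
print(f"projected revenue for month 7: {forecast_month_7:.2f}")
```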
