
Untitled Document-1


Module 1

Data
In computing, data is information that has been translated into a form that is efficient for
movement or processing. On today's computers and transmission media, data is
information converted into binary digital form. The word "data" may be treated as either
singular or plural.

1: factual information (as measurements or statistics) used as a basis for reasoning,
discussion, or calculation
2: information output by a sensing device or organ that includes both useful and irrelevant or
redundant information and must be processed to be meaningful
3: information in numerical form that can be digitally transmitted or processed.
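
The idea that data is information converted into binary digital form can be illustrated with a
short Python sketch. The helper names to_bits and from_bits are hypothetical, and UTF-8 is
assumed as the text encoding; other encodings would produce different bit patterns.

```python
def to_bits(text: str) -> str:
    """Return the UTF-8 bytes of `text` as space-separated 8-bit groups."""
    return " ".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def from_bits(bits: str) -> str:
    """Invert to_bits: parse 8-bit groups back into the original string."""
    return bytes(int(group, 2) for group in bits.split()).decode("utf-8")


encoded = to_bits("Hi")
print(encoded)             # 01001000 01101001
print(from_bits(encoded))  # Hi
```

The same information ("Hi") exists in both forms; only the binary form is what a computer
stores or transmits.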

Data preparation
Data preparation is the process of gathering, combining, structuring and organizing data so it
can be used in business intelligence (BI), analytics and data visualization applications.

Steps in the data preparation process


The process of preparing data includes several distinct steps. There are variations in the
steps listed by different data preparation vendors and data professionals, but the process
typically involves the following tasks:

1 Data collection: Relevant data is gathered from operational systems, data warehouses and
other data sources. During this step, members of the BI team, other data professionals and
end users who gather data themselves should confirm that the data is a good fit for the
objectives of the planned applications.

2 Data discovery and profiling: The next step is to explore the collected data to better
understand what it contains and what needs to be done to prepare it for the intended uses.
Data profiling helps identify patterns, inconsistencies, anomalies, missing data, and other
attributes and issues in data sets so problems can be addressed.

3 Data cleansing: In this step, the identified data errors are corrected to create complete and
accurate data sets that are ready to be processed and analyzed. For example, faulty data is
removed or fixed, missing values are filled in and inconsistent entries are harmonized.

4 Data structuring: At this point, the data needs to be structured, modeled and organized
into a unified format that will meet the requirements of the planned analytics uses.

5 Data transformation and enrichment: Alongside structuring, data often must be
transformed to make it consistent and turn it into usable information. Data enrichment and
optimization further enhance data sets as needed to produce the desired business insights.

6 Data validation and publishing: To complete the preparation process, automated routines
are run against the data to validate its consistency, completeness and accuracy. The
prepared data is then stored in a data warehouse or other repository and made available for
use.
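
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a
production pipeline: the records in RAW are made up, the function names are hypothetical, and
filling missing values with the mean is only one of several possible cleansing choices.

```python
from collections import Counter

# Hypothetical raw records, as they might arrive from two source systems.
RAW = [
    {"id": "1", "region": "north", "sales": "120.5"},
    {"id": "2", "region": "North ", "sales": ""},
    {"id": "3", "region": "SOUTH", "sales": "abc"},
    {"id": "4", "region": "south", "sales": "80"},
]


def profile(rows):
    """Discovery and profiling: count missing and non-numeric sales values."""
    report = Counter()
    for row in rows:
        value = row["sales"].strip()
        if not value:
            report["missing"] += 1
            continue
        try:
            float(value)
        except ValueError:
            report["invalid"] += 1
    return dict(report)


def cleanse(rows):
    """Cleansing and structuring: harmonize regions, convert types, drop faulty rows."""
    cleaned = []
    for row in rows:
        value = row["sales"].strip()
        if value:
            try:
                sales = float(value)
            except ValueError:
                continue  # faulty data is removed
        else:
            sales = None  # missing value, filled in later
        cleaned.append(
            {"id": int(row["id"]), "region": row["region"].strip().lower(), "sales": sales}
        )
    return cleaned


def fill_missing(rows):
    """Cleansing, continued: fill missing sales with the mean of the known values."""
    known = [r["sales"] for r in rows if r["sales"] is not None]
    mean = sum(known) / len(known)
    return [{**r, "sales": mean if r["sales"] is None else r["sales"]} for r in rows]


def validate(rows):
    """Validation: every row needs a known region and a non-negative numeric sales value."""
    return all(
        r["region"] in {"north", "south"} and isinstance(r["sales"], float) and r["sales"] >= 0
        for r in rows
    )


print(profile(RAW))                      # {'missing': 1, 'invalid': 1}
prepared = fill_missing(cleanse(RAW))
print([r["sales"] for r in prepared])    # [120.5, 100.25, 80.0]
print(validate(prepared))                # True
```

Publishing would then mean writing the prepared rows to a data warehouse or other repository,
which is omitted here.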

Types of data analytics


There are four types of analytics. Here, we start with the simplest one and go further to
the more sophisticated types. As a rule, the more complex an analysis is, the more
value it tends to bring.

1 Descriptive analytics
Descriptive analytics answers the question of what happened. Let us bring an example from
ScienceSoft’s practice: having analyzed monthly revenue and income per product group,
and the total quantity of metal parts produced per month, a manufacturer was able to answer
a series of ‘what happened’ questions and decide on focus product categories.

Descriptive analytics juggles raw data from multiple data sources to give valuable insights
into the past. However, these findings simply signal that something is wrong or right, without
explaining why. For this reason, our data consultants don’t recommend that highly data-driven
companies settle for descriptive analytics only; they should rather combine it with other types
of data analytics.

2 Diagnostic analytics
At this stage, historical data can be measured against other data to answer the question of
why something happened. For example, you can check ScienceSoft’s BI demo to see how a
retailer can drill the sales and gross profit down to categories to find out why they missed
their net profit target. Another flashback to our data analytics projects: in the healthcare
industry, customer segmentation coupled with several filters applied (like diagnoses and
prescribed medications) allowed identifying the influence of medications.

Diagnostic analytics gives in-depth insights into a particular problem. At the same time, a
company should have detailed information at its disposal; otherwise, data collection may
have to be done separately for every issue and become time-consuming.

3 Predictive analytics
Predictive analytics tells what is likely to happen. It uses the findings of descriptive and
diagnostic analytics to detect clusters and exceptions, and to predict future trends, which
makes it a valuable tool for forecasting. Check ScienceSoft’s case study to get details on
how advanced data analytics allowed a leading FMCG company to predict what they could
expect after changing brand positioning.

4 Prescriptive analytics
The purpose of prescriptive analytics is to literally prescribe what action to take to eliminate a
future problem or take full advantage of a promising trend. An example of prescriptive
analytics from our project portfolio: a multinational company was able to identify
opportunities for repeat purchases based on customer analytics and sales history.
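
To make the difference between the four types concrete, here is a small Python sketch that
applies each of them to the same toy data set. The revenue figures, the product groups, the
target of 170 and all function names are hypothetical, and the straight-line forecast is just one
simple stand-in for the much richer models used in real predictive analytics.

```python
# Hypothetical monthly revenue per product group (made-up figures).
REVENUE = {
    "Jan": {"bolts": 100, "gears": 80},
    "Feb": {"bolts": 105, "gears": 70},
    "Mar": {"bolts": 110, "gears": 60},
    "Apr": {"bolts": 115, "gears": 50},
}


def descriptive(revenue):
    """What happened: total revenue per month."""
    return {month: sum(groups.values()) for month, groups in revenue.items()}


def diagnostic(revenue, first, last):
    """Why it happened: change per product group between two months."""
    return {group: revenue[last][group] - revenue[first][group] for group in revenue[first]}


def predictive(totals):
    """What is likely to happen: fit a least-squares line and extend it one month."""
    n = len(totals)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(totals) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, totals)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return intercept + slope * n


def prescriptive(forecast, target, changes):
    """What to do: a toy rule that points at the group with the largest decline."""
    if forecast >= target:
        return "keep current plan"
    worst = min(changes, key=changes.get)
    return f"investigate and act on '{worst}'"


totals = descriptive(REVENUE)
changes = diagnostic(REVENUE, "Jan", "Apr")
forecast = predictive(list(totals.values()))

print(totals)    # {'Jan': 180, 'Feb': 175, 'Mar': 170, 'Apr': 165}
print(changes)   # {'bolts': 15, 'gears': -30}
print(forecast)  # 160.0
print(prescriptive(forecast, target=170, changes=changes))  # investigate and act on 'gears'
```

Each step builds on the previous one: the totals show that revenue fell, the breakdown shows
which group caused it, the trend line estimates next month, and the rule suggests an action.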

Difference between Structured and Unstructured data

1 Structured data –
Structured data is data whose elements are addressable for effective analysis. It has been
organized into a formatted repository, typically a database. It covers all data that
can be stored in a SQL database, in tables with rows and columns. Such data has relational keys
and can easily be mapped into pre-designed fields. Today, structured data is the most widely
processed kind in development and the simplest to manage. Example: relational data.

2 Unstructured data –
Unstructured data is data that is not organized in a predefined manner or does not have
a predefined data model, so it is not a good fit for a mainstream relational database.
Alternative platforms exist for storing and managing unstructured data. It is increasingly
prevalent in IT systems and is used by organizations in a variety of business intelligence and
analytics applications. Examples: Word, PDF, text, media logs.
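
The contrast can be sketched in Python using the standard sqlite3 and re modules. The orders
table, the customer names and the free-text note are all made up for illustration. The point is
that the structured version can be queried by column, while the unstructured version needs
ad hoc parsing that depends on how the text happens to be worded.

```python
import re
import sqlite3

# Structured: rows and columns with a key, queryable with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Asha", 250.0), (2, "Ben", 99.5), (3, "Asha", 50.0)],
)
total_for_asha = conn.execute(
    "SELECT SUM(amount) FROM orders WHERE customer = ?", ("Asha",)
).fetchone()[0]
print(total_for_asha)  # 300.0

# Unstructured: free text with no predefined fields.
note = "Asha called about order 1, paid 250.00. Later she ordered again and paid 50."
# The pattern assumes amounts always follow the word "paid"; other wording would be missed.
amounts = [float(m) for m in re.findall(r"paid (\d+(?:\.\d+)?)", note)]
print(amounts)  # [250.0, 50.0]
```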
