Ebook - Statistics Fundamentals For Business Analytics

e-Book on ‘Statistics Fundamentals for Business Analytics’ by Dr.
Devesh Bathla
1. What is Statistics, its types and key characteristics?
Statistics is the discipline of collecting, organizing, analyzing and interpreting data, and it can be
used by anyone in any career or field. Its core concepts include population and sample, and
quantitative and qualitative data.
It has two types:
a) Descriptive statistics (used to summarize and help to describe the data)

b) Inferential statistics (used for predicting and making decisions)
Characteristics of statistics
• It consists of aggregates of facts

• It is affected by many causes
• It should be numerically expressed
• It must be enumerated or estimated accurately
• It should be collected in a systematic manner
• It should be collected for a predetermined purpose
• The facts should be capable of being placed in relation to each other
2. Five Stages of Statistics
3. Types of Data and their general classification
a) Primary data
Data that is collected for the first time, so it is fresh. The user collects it directly from the
people or respondents he or she needs, for example through interviews and surveys.
b) Secondary data
Data that has already been collected or published by someone else. It may be related to your
topic, but it was not collected specifically for it.
Classification of data
a) Geographical Data (area-wise, e.g. city, state)

b) Chronological Data (on the basis of time)
c) Qualitative Data (on the basis of some attribute or quality, e.g. sex, color of hair)
d) Quantitative Data (according to some numerical characteristic, e.g. height, weight, profits, sales)
4. Sampling - The way in which you select the people or respondents from whom you will collect
data, together with the reason for choosing that method to draw the sample.
5. Types of Sampling?
a) Probability Sampling - the sample is chosen from a larger population using a method based
on the theory of probability. For a sample to count as a probability sample, each participant must be
chosen by random selection.
b) Non-Probability Sampling - the sample is gathered by a process that does not give all the
individuals in the population a known, non-zero chance of being selected.
Probability Sampling
Simple random sampling - In a simple random sample (SRS) of a given size, every subset of the
frame of that size has an equal probability of being selected.
Cluster sampling - The population is divided into clusters on the basis of demographic or
geographic parameters, and whole clusters are then selected at random.
Systematic sampling - Systematic sampling (also known as interval sampling) relies on arranging the
study population according to some ordering scheme and then selecting elements at regular intervals
through that ordered list.
Stratified sampling - When the population embraces a number of distinct categories, the frame can be
organized by these categories into separate "strata." Each stratum is then sampled as an independent
sub-population, out of which individual elements can be randomly selected.
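The four probability sampling methods above can be sketched in a few lines of Python. This is only an illustrative sketch using the standard `random` module: the frame of 100 numbered respondents, the two strata ("north", "south") and the ten "city" clusters are all made-up assumptions, and the function names are hypothetical.

```python
import random

def simple_random_sample(frame, n, rng):
    """Every subset of size n is equally likely to be drawn."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Pick a random start, then take every k-th element of the ordered list."""
    k = len(frame) // n          # sampling interval
    start = rng.randrange(k)     # random start inside the first interval
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw an independent simple random sample from each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep every member of the chosen ones."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(42)                      # fixed seed so the run is repeatable
frame = list(range(1, 101))                  # hypothetical frame: respondents 1..100
strata = {"north": frame[:50], "south": frame[50:]}
clusters = {f"city_{i}": frame[i * 10:(i + 1) * 10] for i in range(10)}

srs = simple_random_sample(frame, 10, rng)
systematic = systematic_sample(frame, 10, rng)
stratified = stratified_sample(strata, 5, rng)
clustered = cluster_sample(clusters, 2, rng)
```

Note the difference between the last two: stratified sampling takes some members from every group, while cluster sampling takes every member from some groups.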
Non-Probability Sampling
Convenience sampling - It is a non-probability sampling technique where samples are selected from
the population only because they are conveniently available to the researcher.
Judgmental or purposive sampling - The sample is selected based purely on the researcher's
knowledge and judgment. The researcher chooses only those people he or she feels are right to
participate in the research study.
Snowball sampling - It helps researchers find a sample when subjects are difficult to locate. Researchers
use this technique when the population is small and not easily accessible. Once the researchers find
suitable subjects, they ask them for help in finding similar subjects, until the sample reaches a
reasonably good size.
Quota sampling - It is a sampling technique in which the researcher divides the population into groups
and collects data from a fixed number (quota) of respondents in each group, chosen non-randomly.
6. Data Presentation
Data can be presented in two forms: tabular or graphical. Data presentation is very important in business
analysis because, by viewing, studying and analyzing the data, a business analyst predicts the future
or proposes solutions to problems.
Tabular
• Tabular data is a way to show data in table form.
• It arranges data systematically in rows and columns.
• It simplifies the complex data.
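
As a small illustration of tabular presentation, the sketch below turns a raw list of observations into a frequency table of rows and columns. The sales categories are made-up data and `format_table` is a hypothetical helper, not part of any library.

```python
from collections import Counter

# Hypothetical raw data: the product category of each sale.
sales = ["A", "B", "A", "C", "B", "A", "C", "A"]

counts = Counter(sales)
rows = [(category, counts[category]) for category in sorted(counts)]

def format_table(rows):
    """Lay the (category, frequency) pairs out as aligned text columns."""
    lines = [f"{'Category':<10}{'Frequency':>10}"]
    lines += [f"{category:<10}{freq:>10}" for category, freq in rows]
    return "\n".join(lines)

table = format_table(rows)
print(table)
```

The same counts could equally feed a bar graph or pie chart; the table is simply the most compact form.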

Graphical
• Graphical representation is a visual display of data and is a more effective way to represent data.
• Graphical representation makes data more understandable to others.
• There are many different types of graphical representation such as pie chart, line chart, bar
diagram, histogram, scatter plot and others.
[Figures: Bar Graph, Histogram, Line Chart, Pie Chart]
7. Measures of Central Value
Different types of series

a) Individual series
b) Discrete series

c) Continuous series
Mean - the average of the data, obtained by dividing the sum of all values by the number of values.
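Median - the middle value of the data when it is arranged in order.

• Individual and discrete series: Median = value of the ((N + 1) / 2)th item
• Continuous series: Median = L + ((N / 2 - cf) / f) x i, where the median class is the class
containing the (N / 2)th item, L = lower limit of the median class, cf = cumulative frequency of
the class before it, f = frequency of the median class and i = class interval (width)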
Mode - most frequently occurring number found in a set of numbers.
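
The three measures can be computed with Python's standard `statistics` module. The sketch below uses made-up figures, and `grouped_median` is a hypothetical helper that applies the continuous-series formula Median = L + ((N / 2 - cf) / f) x i, assuming the classes are listed in ascending order.

```python
import statistics

# Individual series: hypothetical daily sales figures.
values = [12, 15, 11, 15, 18, 20, 15]

mean = statistics.mean(values)      # sum of values / number of values
median = statistics.median(values)  # middle value after sorting
mode = statistics.mode(values)      # most frequently occurring value

def grouped_median(classes):
    """Median of a continuous series: L + ((N/2 - cf) / f) * i.

    classes: list of (lower_limit, upper_limit, frequency) in ascending order.
    """
    n = sum(freq for _, _, freq in classes)
    if n == 0:
        raise ValueError("classes must contain at least one observation")
    cumulative = 0  # cf: cumulative frequency of the classes before the current one
    for lower, upper, freq in classes:
        if cumulative + freq >= n / 2:   # this is the median class
            return lower + ((n / 2 - cumulative) / freq) * (upper - lower)
        cumulative += freq

# Hypothetical continuous series: class intervals with their frequencies.
classes = [(0, 10, 5), (10, 20, 8), (20, 30, 12), (30, 40, 5)]
median_grouped = grouped_median(classes)
```

Here N = 30, so the median class is 20-30 (cf = 13, f = 12, i = 10).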

8. Measures of Dispersion
Range = L – S (L = largest value, S = smallest value)
Coefficient of range = (L – S) / (L + S)
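Mean Deviation
MD = ∑ f |D| / N (|D| = absolute deviation of a value from the mean or median, f = its frequency,
N = total number of observations)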
Standard Deviation
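A minimal sketch of these measures on made-up numbers follows. It assumes an individual series (every frequency f is 1) and takes deviations from the arithmetic mean. The standard deviation comes from the standard `statistics` module, which offers both a population version (`pstdev`) and a sample version (`stdev`).

```python
import statistics

values = [4, 8, 15, 16, 23, 42]  # hypothetical observations

largest, smallest = max(values), min(values)
value_range = largest - smallest                                    # L - S
coefficient_of_range = (largest - smallest) / (largest + smallest)  # (L - S) / (L + S)

# Mean deviation about the mean: sum of |D| divided by N (each f is 1 here).
mean = statistics.mean(values)
mean_deviation = sum(abs(x - mean) for x in values) / len(values)

population_sd = statistics.pstdev(values)  # squared deviations divided by N
sample_sd = statistics.stdev(values)       # squared deviations divided by N - 1
```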
9. What is correlation and what are its types?
Correlation - shows whether two variables are related to each other and how strongly they are
related.
Correlation can be classified in 3 different ways:

Positive and Negative
• Positive correlation means the two variables vary in the same direction: when one increases,
the other also increases.
• Negative correlation means the two variables vary in opposite directions: when one increases,
the other decreases.
Simple, Partial and Multiple

• Simple correlation means the study of two variables only.
• Partial correlation means more than two variables are involved, but the relationship between only
two of them is studied while the others are held constant.
• Multiple correlation means the study of three or more variables together.
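Linear and Non-Linear
• Linear correlation means the two variables change in a constant ratio, so the points lie along a
straight line.
• Non-linear (curvilinear) correlation means the ratio of change is not constant.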
Coefficient of Correlation
When r = +1, it means there is a perfect positive relation between the variables.
When r = -1, it means there is a perfect negative relation between the variables.
When r = 0, it means there is no linear relationship between the variables.
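
The three special values of r can be reproduced with a short sketch of Pearson's coefficient of correlation. The function `pearson_r` and the data are illustrative assumptions. The third data set is chosen deliberately: y depends exactly on x through a curve, yet r comes out as 0, because r only measures linear association.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: co-variation of x and y scaled by their individual variation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(variation_x * variation_y)

xs = [1, 2, 3, 4, 5]
r_positive = pearson_r(xs, [2, 4, 6, 8, 10])   # y rises in step with x
r_negative = pearson_r(xs, [10, 8, 6, 4, 2])   # y falls as x rises
r_curved = pearson_r(xs, [4, 1, 0, 1, 4])      # y = (x - 3)^2, a symmetric curve
```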

10. What is regression and its uses?
Regression: It describes the relation between the mean value of one variable (the dependent
variable) and the values of one or more other variables (the independent variables).
It also tells us how one factor affects another.
USES OF REGRESSION
• Predictive analysis
• Operational efficiency
• Supporting decisions
• Correcting errors and gaining new insights
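To make the predictive use concrete, here is a minimal sketch of simple linear regression fitted by least squares, one common way of estimating a regression line. The function `fit_line`, the variable names and the figures are all hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares estimates of intercept a and slope b in y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept, slope

ad_spend = [1, 2, 3, 4, 5]   # hypothetical independent variable
sales = [2, 4, 5, 4, 5]      # hypothetical dependent variable

intercept, slope = fit_line(ad_spend, sales)
predicted_sales = intercept + slope * 6   # prediction for a new value of x
```

The slope answers the "how one factor affects another" question: it is the estimated change in mean sales for each extra unit of ad spend.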
11. Central limit theorem
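The central limit theorem states that the distribution of sample means approaches a normal distribution
as the sample size increases, whatever the shape of the population's distribution (provided it has a
finite variance). It does not make the data itself normal. As a rule of thumb, a sample size of at least
30 is considered sufficient, and the larger the sample, the closer the distribution of sample means is to
normal.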
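The theorem can be checked with a small simulation. The sketch below draws repeated samples from a strongly right-skewed population (an exponential distribution, chosen here only as an example) and compares the skewness of the sample means for sample sizes 2 and 30. The helper names, the seed and the number of repetitions are arbitrary assumptions.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the simulation is repeatable

def sample_means(sample_size, n_samples=2000):
    """Means of repeated samples drawn from a right-skewed exponential population."""
    return [statistics.mean([rng.expovariate(1.0) for _ in range(sample_size)])
            for _ in range(n_samples)]

def skewness(data):
    """Average cubed standardized deviation; close to 0 for a symmetric bell curve."""
    m = statistics.mean(data)
    s = statistics.pstdev(data)
    return sum(((x - m) / s) ** 3 for x in data) / len(data)

means_small = sample_means(2)    # sample size 2: means are still clearly skewed
means_large = sample_means(30)   # sample size 30: means are much closer to normal

skew_small = skewness(means_small)
skew_large = skewness(means_large)
```

The individual observations stay skewed in both cases; only the distribution of the sample means moves toward the bell shape.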
Bell curve - A normal distribution plots as a symmetric, bell-shaped curve, so if the data forms a
proper bell curve it is approximately normally distributed. Sometimes the curve is skewed instead,
either to the left or to the right, in which case the data is not normally distributed.
12. Other terms
Error term
• Its estimates in a fitted model are known as residuals.
• A residual is the difference between an actual value and the value predicted for it.
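As a short sketch, residuals can be computed as actual minus predicted (the usual sign convention, assumed here). The line y = 2.2 + 0.6x is the least-squares line for this made-up data set.

```python
xs = [1, 2, 3, 4, 5]        # hypothetical independent variable
actual = [2, 4, 5, 4, 5]    # hypothetical observed values

# Predictions from the least-squares line for this data: y = 2.2 + 0.6 * x.
predicted = [2.2 + 0.6 * x for x in xs]

residuals = [a - p for a, p in zip(actual, predicted)]   # actual - predicted
sum_of_squared_errors = sum(r ** 2 for r in residuals)
```

For a least-squares line with an intercept, the residuals add up to zero, so the sum of squared residuals is used to judge how well the line fits.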
Time series
A time series is data recorded over time. Time series analysis is another statistics topic, through
which we can study the trend and sometimes also forecast future values.
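One simple way to expose the trend in a time series is a moving average, sketched below on made-up monthly sales. The window of 3 and the idea of using the last average as a naive forecast are assumptions for illustration, not a recommended forecasting method.

```python
def moving_average(series, window):
    """Average of each run of `window` consecutive observations."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]

monthly_sales = [10, 12, 13, 12, 15, 16, 18, 17, 19, 21]  # hypothetical data

trend = moving_average(monthly_sales, 3)
naive_forecast = trend[-1]   # carry the latest smoothed level forward
```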
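P-value
The p-value is also very important in statistics. It is the probability of getting a result at least as
extreme as the one observed if the null hypothesis is true, and it tells us whether a result is
statistically significant. By convention, if the p-value is less than 0.05 the result is said to be
statistically significant; if it is 0.05 or more, the result is not statistically significant, meaning the
evidence against the null hypothesis is weak. It does not mean the data cannot be used for further analysis.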
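As an illustration, the sketch below computes a two-sided p-value for a one-sample z-test, which is only one of many tests that produce a p-value. It assumes the population standard deviation is known. The function name and the figures are hypothetical; `NormalDist` is part of the standard `statistics` module.

```python
from statistics import NormalDist

def two_sided_p_value(sample_mean, null_mean, population_sd, n):
    """P-value of a one-sample z-test (population standard deviation assumed known)."""
    standard_error = population_sd / n ** 0.5
    z = (sample_mean - null_mean) / standard_error
    # Probability of a z value at least this far from 0 in either direction.
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical example: 64 orders average 52.5 against a claimed mean of 50.
p_value = two_sided_p_value(sample_mean=52.5, null_mean=50, population_sd=8, n=64)
is_significant = p_value < 0.05   # conventional 5% threshold
```

Here z = 2.5, so the p-value falls below 0.05 and the difference from the claimed mean would be called statistically significant.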
