COPAI Method

This lesson covers data management through statistics, or the COPAI Method
(Collection, Organization, Presentation, Analysis, and Interpretation of data).

Learning Outcome:
At the end of this lesson, the learners should be able to use a variety of
statistical tools to process and manage numerical data and to predict certain
conditions.

Learning Content:

Statistics is a branch of Mathematics that deals with the scientific Collection,
Organization, Presentation, Analysis, and Interpretation (COPAI) of data in order to
obtain useful and meaningful information to support a decision that one makes when
faced with a problem or an inquiry. Statistics can be divided into two major areas:
Descriptive Statistics and Inferential Statistics.

Descriptive Statistics
▪ Descriptive statistics comprises the statistical methods dealing with the
collection, tabulation and summarization of data, so as to present meaningful
information.
▪ Descriptive statistics is used to say something about a set of information
that has been collected only.
▪ It utilizes numerical and graphical data methods to look for patterns in the
data set.

Inferential Statistics
▪ Statistical inference, on the other hand, consists of the methods involved
with the analysis and interpretation of data that will enable the statistician
to develop meaningful inferences about the data.
▪ Inferential statistics is used to make predictions or comparisons about a
larger group (a population) using information gathered about a small part of
that population.
▪ It draws conclusions like decisions, predictions, or generalizations about
the data set.

Both subfields are interrelated; while descriptive statistics organizes the
collected data in a systematic manner, statistical inference analyzes the data and
enables one to produce significant inferences about it.

Population vs Sample

A population is the totality of all the elements or persons in which one has an
interest at a particular time; its size is denoted by N. A sample is a subset of a
population; its size is denoted by n.

Parameter vs Statistic

A parameter is any statistical information or attribute taken from a population.
A statistic is any estimate of a statistical attribute taken from a sample. In other
words, a parameter is a numerical or nominal characteristic of a population, while
a statistic is a numerical or nominal characteristic of a sample.

Types of Data

According to Value:
▪ Quantitative data are numerical information obtained from counting or
measuring, which can be manipulated by any fundamental operation.
▪ Qualitative data are descriptive attributes characterized by categorical
responses.

According to Representation:
▪ Numerical data are represented by numbers.
▪ Categorical data have labels (i.e. words). (For example, a list of the products
bought by different families at a grocery store would be categorical data, since
it would go something like {milk, eggs, toilet paper, etc.})


A variable is a statistical quantity that is capable of assuming several values. A
variable is used to stand for something which does not have a permanent value, and
this gives rise to different types of variables:
▪ Discrete Variables are quantities that can assume only separate, countable
values (for example, the number of students in a class).
▪ Continuous Variables are quantities that can assume any of the infinitely many
values within an interval (for example, a person's height).

Levels of Measurement

▪ Nominal Level of measurement is characterized by data that consist of
names, labels or categories, and the data cannot be arranged in an ordering
scheme. (Example: collection of “yes, no and undecided” responses)
▪ Ordinal Level of measurement involves data that may be arranged in some
order, but differences between data values either cannot be determined or are
meaningless. (Example: Mike ranked 1st; Sally ranked 4th; Ruth ranked 8th
and Rey ranked 12th)
▪ Interval Level of measurement is like the ordinal level, but meaningful
amounts of differences can be determined between data. (Example: body
temperature in Celsius scale)
▪ Ratio Level of measurement is the interval level modified to include
an inherent zero starting point. (Example: height of pine trees along Session
Road)


Sampling Techniques
To ensure the validity of conclusions or inferences from the sample to the
population, the following sampling techniques are employed:

1. Simple Random Sampling – the simplest method for this sampling is by lottery.

2. Stratified Random Sampling – to avoid bias in the selection of samples,
stratified random sampling is used, wherein the population is divided into
categories or strata.

To obtain the sample size from the population, we can make use of
Slovin's Formula:
$$ n = \frac{N}{1 + Ne^{2}} $$
where N is the population size, n is the sample size, and e is the margin of
error (either 5% or 1%).
3. Cluster Sampling – the total population is divided into a number of relatively
small areas, and some of these areas are randomly selected for inclusion in the
overall sample.
4. Systematic Sampling – an element of randomness can be introduced by using
random numbers to pick the unit with which to start.
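
As a rough illustration, the sample-size formula and two of the sampling techniques above can be sketched in Python. The population of 1,000 numbered units, the fixed random seed, and the choice to round the sample size up to a whole unit are assumptions made only for this example; they are not part of the lesson.

```python
import math
import random

def slovin_sample_size(population_size, margin_of_error):
    """n = N / (1 + N * e^2), rounded up to a whole unit (rounding is an assumption)."""
    return math.ceil(population_size / (1 + population_size * margin_of_error ** 2))

def simple_random_sample(units, n, rng):
    """Lottery-style draw: every unit has the same chance of being selected."""
    return rng.sample(units, n)

def systematic_sample(units, n, rng):
    """Pick a random starting unit, then take every k-th unit after it."""
    k = len(units) // n          # sampling interval
    start = rng.randrange(k)     # random start introduces the element of randomness
    return [units[start + i * k] for i in range(n)]

# Hypothetical population: units numbered 1 to 1000.
population = list(range(1, 1001))
rng = random.Random(2020)        # fixed seed so the draw can be repeated

n = slovin_sample_size(len(population), 0.05)
lottery_sample = simple_random_sample(population, n, rng)
every_kth_sample = systematic_sample(population, n, rng)

print("Sample size at a 5% margin of error:", n)
print("First five units drawn by lottery:", lottery_sample[:5])
print("First five units drawn systematically:", every_kth_sample[:5])
```

With N = 1000 and e = 0.05, the formula gives 1000 / 3.5, or about 285.7, so the sketch takes 286 units.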

▪ Textual Method. Collected data are presented in narrative, paragraph form.

▪ Tabular Method. Collected data are orderly arranged and presented in
rows and columns for easier and more comprehensible comparison.

▪ Graphical Method. Collected data are presented in visual or pictorial form.


1. Stem-and-Leaf Display

A stem-and-leaf display is a table in which the first column is considered as
the stem, and each digit to the right of the stem can be thought of as a leaf.
Example:

Tens | Ones
2 | 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 6, 6, 7, 7, 8, 9
3 | 0, 0, 0, 1, 1, 1, 1, 3, 3, 3, 5, 6, 6
4 | 0, 2, 3, 4, 5, 6
5 | 7, 9, 9
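
A stem-and-leaf display like the one above can be built by splitting each value into its tens digit (stem) and ones digit (leaf). The short Python sketch below is only an illustration; the function name `stem_and_leaf` is made up for this example, and the data are a handful of values read off the example display, not the full data set.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group values by tens digit (stem) and collect the ones digits (leaves)."""
    rows = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)   # 57 -> stem 5, leaf 7
        rows[stem].append(leaf)
    return dict(rows)

# A few values taken from the example display (illustration only).
data = [57, 23, 40, 31, 59, 24, 30, 42, 23, 59, 35]

display = stem_and_leaf(data)
for stem, leaves in display.items():
    print(stem, "|", ", ".join(str(leaf) for leaf in leaves))
```

Sorting the values first keeps both the stems and the leaves within each row in ascending order.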

2. Frequency Distribution Table

A frequency distribution table is a table that displays the frequency of various
outcomes in a sample. Each entry in the table contains the frequency or count of the
occurrences of values within a particular group or interval, and in this way, the
table summarizes the distribution in the sample.


a. Class limits (Apparent Limits) – the highest and lowest values describing a
class. In 31 – 40, 31 is the lower class limit while 40 is the upper class limit.
b. Mid-Value – is the midpoint between the class limits of each class. It can be
computed by adding both the lower and upper class limits and dividing by 2. In
31 – 40, the midpoint is 35.5.
c. Range – the difference between the upper class limit and the lower class
limit, plus 1. Example: the range of the class interval 11 – 20 years old is 10,
because 20 minus 11 is 9, and 9 + 1 = 10. This means that 10 ages are included
in the class interval 11 – 20.
d. Class Boundaries (Real Limits) – the upper and lower values of a class in a
grouped frequency distribution; they have one more decimal place than the class
limits and end with the digit 5. In 31 – 40, the class boundaries are 30.5 and
40.5.
e. Class Frequency – the number of observations or number of occurrences
under each class interval.
f. Total Frequency – the total number of occurrences or observations in all the
class intervals combined.
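
The terms above can be checked with a small Python sketch. It assumes whole-number class limits (so the boundaries lie 0.5 beyond the limits), and the list of ages is hypothetical data invented for the illustration; the function names are also made up for this example.

```python
def class_summary(lower, upper):
    """Mid-value, range (class size), and boundaries of a class with whole-number limits."""
    return {
        "mid_value": (lower + upper) / 2,
        "range": upper - lower + 1,
        "boundaries": (lower - 0.5, upper + 0.5),
    }

def frequency_table(values, lowest_limit, size):
    """Return (lower limit, upper limit, class frequency) for each class."""
    top = max(values)
    table = []
    lower = lowest_limit
    while lower <= top:
        upper = lower + size - 1
        frequency = sum(1 for v in values if lower <= v <= upper)
        table.append((lower, upper, frequency))
        lower += size
    return table

# The class 31 - 40 used in the definitions above.
print(class_summary(31, 40))

# Hypothetical ages, grouped into classes of size 10 starting at 11.
ages = [12, 15, 19, 21, 24, 24, 30, 33, 38]
table = frequency_table(ages, 11, 10)
total_frequency = sum(frequency for _, _, frequency in table)

for lower, upper, frequency in table:
    print(f"{lower} - {upper}: {frequency}")
print("Total frequency:", total_frequency)
```

The total frequency should always equal the number of observations, which is a quick way to check that no value was left out of the classes.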

Most people find visual representations to be useful in highlighting information
obtained from sample observations. The information presented in a frequency
distribution table can be more easily grasped if it is presented in a graphical format.
There are various graphical means to visualize a frequency distribution; histograms,
pie charts, frequency polygons and cumulative frequency polygons (ogives) are among
the most popular.

Histogram
▪ A graph in which the classes are marked on the
horizontal axis (x-axis) and the class frequencies
on the vertical axis (y-axis). The height of the
bars represents the class frequencies, and the
bars are drawn adjacent to each other. The
histogram focuses only on the frequency of each
class, regardless of what other information is
contained in the observations.

Frequency Polygon
▪ A graph that displays the data using points
which are connected by lines. The frequencies
are represented by the height of the points at the
midpoints of the classes. The vertical axis
represents the frequency of the distribution
while the horizontal axis represents the
midpoints of the frequency distribution.

Pie Charts
▪ A pie chart displays the absolute (or relative)
frequencies of the class intervals as sectors of a
circle. Each sector in a pie chart corresponds to a
class interval; the ratio of the area of the sector to
the area of the circle (i.e., the ratio of the measure
of the sector's central angle to 360°) is equal to the
relative frequency of the class interval.
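
The relationship between relative frequency and central angle can be worked out with a few lines of Python. This is only a sketch: the class intervals and frequencies below are hypothetical, and the function name `pie_sectors` is made up for the illustration.

```python
def pie_sectors(frequencies):
    """Map each class to (relative frequency, central angle in degrees)."""
    total = sum(frequencies.values())
    return {
        label: (count / total, 360 * count / total)
        for label, count in frequencies.items()
    }

# Hypothetical class frequencies.
frequencies = {"11 - 20": 3, "21 - 30": 4, "31 - 40": 2}

sectors = pie_sectors(frequencies)
for label, (relative, angle) in sectors.items():
    print(f"{label}: relative frequency {relative:.3f}, central angle {angle:.1f} degrees")
```

The central angles of all sectors add up to 360 degrees, just as the relative frequencies add up to 1.
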
Name: ______________________________________________ Score: ___________

Course/Year/Section/Major: ________________________ Date: ____________

Directions: Classify each of the given examples below according to:

A. Type of Variables
_________________ 1. Among the 50 installed computers in the
laboratory, 14 are not working.
_________________ 2. The varying speed of Typhoon Yolanda in a day.
_________________ 3. ATM account numbers.
_________________ 4. All rational numbers in number set.
_________________ 5. Number of enrollees in the School Year 2019-2020.

B. Type of Data (by value)

_________________ 1. Pizza sliced into 8 pieces.
_________________ 2. Every type of attitude of every Elementary
_________________ 3. Half kilo of meat bought in the Market.
_________________ 4. Nine-Point Hedonic Scale used to describe the
Taste of Cookies.
_________________ 5. Level of spice put in a Noodle soup.

C. Level of Measurement
_________________ 1. Pizza sliced into 8 pieces.
_________________ 2. Every type of attitude of every Elementary
_________________ 3. Half kilo of meat bought in the Market.
_________________ 4. Nine-Point Hedonic Scale used to describe the
Taste of Cookies.
_________________ 5. Level of spice put in a Noodle soup.

Directions: Illustrate the data below using (a) a Stem-and-Leaf Display, (b) a
Frequency Distribution Table with i = 6, (c) a Histogram and (d) a
Frequency Polygon. Use a separate sheet of white paper.

Monthly figures on the number of mail items (in thousands) that were dropped at
the Post Office within two years.

80 75 64 66 70 78 73 73
67 76 101 73 75 77 84 80
97 62 76 78 82 75 84 77
68 73 70 78 85 81 78 87
83 78 108 86 77 76 118 117
78 67 84 79 73 86 75 79
Score 0 1 3 5
- no answer - correct - correct - correct
For every and answer wrong answer and answer and
number computation computations computations computations
only with relevant

