
EC 203 Tutorial 2

The University of the South Pacific EC203 Economic Statistics

Tutorial 2 ANS Data Collection- Sampling


Learning Outcomes
• Key objective of sampling.
• Different techniques of probability (or random) sampling, including simple random, stratified and cluster sampling.
• Issues involved in the selection of sampling techniques, i.e. the strengths and weaknesses of each method.
• Application of these techniques in commerce.
• Errors in data collection.
Questions
1. Define and differentiate between a census and a sample, and state the advantages and
disadvantages of both methods of data collection.

Census: collecting data about the whole population, i.e. all items in the population.
Sample: selecting a subset of the whole population, often done for reasons of cost and
practicality.

Advantages of sampling:
• Sampling can save money
• Sampling can save time

Disadvantage of sampling:
• May not be as reliable as a census

Advantage of census:
• Accurate and reliable

Disadvantages of census:
• Costly
• Time consuming

2. How can we overcome the disadvantages of sampling to a certain extent?
By using random sampling plans. With random sampling plans it is possible to quantify the
reliability of the results, because the selection process is unbiased.
3. Discuss advantages of random sampling.
Random sampling: when every unit of the population has the same probability of being
selected for the sample.
There are three types of random sampling: simple random sampling, stratified sampling
and cluster sampling.
Advantages:
1. It avoids the bias introduced by non-random selection methods. Random sampling is free
from selection bias because every unit of the population has the same probability of
being selected for the sample.
2. It is possible to quantify the reliability of the result, that is, we can apply statistical
methods and get reliable results.
• Other advantages of specific types of random sampling:
Stratified sampling: as well as acquiring information about the entire population, we can
also make inferences within each stratum or compare strata.
Cluster sampling: useful when it is difficult or costly to develop a complete list of the
population members, and wherever the population elements are widely dispersed
geographically. (Slides 19-20.)
4. What are the important factors to be considered in choosing sampling techniques?

• Availability of a sampling frame: if a frame is available then any of the three techniques
can be applied, but if not then only cluster sampling.
• Budget, time and the extent of accuracy required: cluster sampling is one of the cheapest,
but if there is no budget constraint then any technique can be chosen.
• Population characteristics:
  • whether small: then go for a census
  • whether similar: then a small sample size is enough
  • whether dispersed or nearby: if dispersed then cluster sampling may be recommended.
• Possible biases involved, etc.

5. Tevita, a newly graduated accountant from USP, starts a job at PWC as an external auditor. His first job
is to audit the accounts receivable section of MH supermarket. MH has a total of 8500 debtors who have
purchased goods on credit from MH. Of these 8500 debtors, 3400 owe less than $5000, 4250 owe
between $5000 and $20000, and 850 owe more than $20000. Tevita was asked to audit 500 accounts to
ensure that the accounts receivable balance is true and fair in the financial reports. The MH supermarket
accountant assisted Tevita by providing a list of these debtors in order of amount owed, from lowest
to highest, numbered from 0001 to 8500.

a) What is the population of interest and the sample size?

Population: the group of all items of interest to a statistics practitioner.
Sample: a subset of data drawn from the population.
Population of interest: all debtors at MH.
Population size (N) = 8500 debtors
Sample size (n) = 500 debtors

b) Using simple random sampling, state who will be the first five debtors to be selected. (Use the
random numbers table provided below. Select the sample starting at the first digit of row 2 and
working along the row.)
Table 1: Random Numbers

7898 8002 4418 2747 8079 4993 6863 9542 0949 4531 6955 5826 9971 6233 7887
8640 3204 6906 5719 1116 5982 9532 2422 8333 8828 9002 2680 1928 8532 3600
4431 3453 3070 5239 3168 6490 0274 8443 9984 7503 0263 8086 3372 5454 1599
5868 4764 0158 1225 5558 7840 9394 8126 6974 1561 4765 0758 8717 6979 6306
8514 6959 7775 5844 5147 9173 4558 9107 0453 6119 2915 6586 9670 6580 5202
3137 1170 0345 6099 6352 6074 6142 1898 3657 1924 5625 3556 8178 0103 6107
3490 3349 7010 2045 6123 6271 8981 5274 2183 9820 0957 3988 6747 3508 8914
Simple Random Sampling
Steps
1) Get the population frame (a list containing details about all items in the population),
e.g. the list of all 8500 debtors at MH.
2) Number each element using as many digits as there are in the population size (N).
E.g. N = 8500 has four digits, so the first debtor in the list will be debtor # 0001 and
the last will be debtor # 8500.
3) Since the population size is 8500, i.e. four digits, we read 4-digit numbers starting from
the first digit of row 2 and working along the row.
ANS: 8640 (NO, exceeds 8500), 3204, 6906, 5719, 1116, 5982.
The first five debtors selected are # 3204, 6906, 5719, 1116 and 5982.
Note: If the selected number is more than 8500 (or is 0000), we ignore the value. Likewise,
if a number that has already been selected comes up again, we ignore the value.
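The selection rule above can be checked with a short script. This is a minimal sketch, not part of the tutorial: the function name `select_from_table` and its parameters are made up for illustration, and row 2 is copied from Table 1.

```python
# Row 2 of Table 1, copied from the tutorial.
ROW_2 = "8640 3204 6906 5719 1116 5982 9532 2422 8333 8828 9002 2680 1928 8532 3600"


def select_from_table(digits, population_size, how_many, width=4):
    """Read fixed-width numbers from a digit string, keeping valid, unseen labels.

    Hypothetical helper: a label is valid if it lies between 1 and
    population_size; out-of-range and repeated labels are ignored.
    """
    selected = []
    for start in range(0, len(digits) - width + 1, width):
        label = int(digits[start:start + width])
        if 1 <= label <= population_size and label not in selected:
            selected.append(label)
        if len(selected) == how_many:
            break
    return selected


digits = ROW_2.replace(" ", "")
first_five = select_from_table(digits, population_size=8500, how_many=5)
print([f"{label:04d}" for label in first_five])
```

The script skips 8640 because it exceeds the population size and then keeps the next five numbers in the row.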
c) If Tevita decided to take a stratified sample based on the amount owed by the debtors,
according to the proportions in the population, then out of the 500 accounts to be selected,
how many should be selected from those owing less than $5000?
3400 debtors owe less than $5000.
Proportion of the population owing less than $5000 = 3400/8500 = 0.40 or 40%
Sample from this stratum = 40% of 500 = 200 debtors
(NOTE: If the value for the sample size comes out in decimals, round it up to the next whole number.)
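The same proportional rule can be applied to all three strata. The sketch below is illustrative only: the tutorial works out just the first stratum, so the other two figures are what the same formula gives, and the stratum labels are made up for the example.

```python
# Stratum sizes from the question; the labels are illustrative.
strata = {
    "owes less than $5000": 3400,
    "owes $5000 to $20000": 4250,
    "owes more than $20000": 850,
}
N = sum(strata.values())  # population size, 8500
n = 500                   # total sample size

# Proportional allocation, rounded up to the next whole number as the
# NOTE above instructs (integer ceiling division avoids float rounding).
allocation = {name: (n * size + N - 1) // N for name, size in strata.items()}

for name, count in allocation.items():
    print(f"{name}: {count} accounts")
```

Here every share is a whole number, so the allocations add up to exactly 500. When rounding up is needed, the total can exceed the planned sample size slightly.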
d) State the first three accounts that will be selected for review from those accounts owing
less than $5000. Select the sample starting at the second digit of row 3 and working
along the row.
No. of debtors owing < $5000 = 3400
1. Now our sampling frame is the list of debtors at MH owing < $5000, so N = 3400.
2. Hence the first debtor in the list will be debtor # 0001 and the last will be debtor # 3400.
3. Read 4-digit numbers starting at the second digit of row 3.
*A four-digit number between 0001 and 3400 is selected as part of the sample.
*Ignore repeated numbers and those that are more than 3400.
Ans:
4313 (NO), 4533 (NO), 0705, 2393, 1686
Hence the first 3 accounts that will be audited from those owing less than $5000 will be
account # 0705, 2393 and 1686.
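Starting at the second digit changes where the 4-digit groups fall, which is easy to get wrong by hand. The following sketch re-reads row 3 as one continuous digit stream; the helper `select_from_digits` and its `skip` parameter are hypothetical names used only for this illustration.

```python
# Row 3 of Table 1, copied from the tutorial.
ROW_3 = "4431 3453 3070 5239 3168 6490 0274 8443 9984 7503 0263 8086 3372 5454 1599"


def select_from_digits(row, population_size, how_many, skip=0, width=4):
    """Treat the row as one digit stream, drop `skip` leading digits,
    then read `width`-digit labels, ignoring invalid or repeated ones."""
    digits = row.replace(" ", "")[skip:]
    selected = []
    for start in range(0, len(digits) - width + 1, width):
        label = int(digits[start:start + width])
        if 1 <= label <= population_size and label not in selected:
            selected.append(label)
        if len(selected) == how_many:
            break
    return selected


# skip=1 means "start at the second digit of the row".
first_three = select_from_digits(ROW_3, population_size=3400, how_many=3, skip=1)
print([f"{label:04d}" for label in first_three])
```

The stream read this way begins 4313, 4533, 0705, 2393, 1686, so the first two groups are rejected for exceeding 3400.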

e) Comment: is sample selection according to proportions in the population compulsory in
stratified sampling?
Not compulsory. In stratified sampling, selection from each sub-population can be
proportionate or disproportionate, since the key goal is to make the sample more representative.
In the above case, I would recommend that Tevita take a disproportionate sample, that is,
a bigger sample from the group owing more than $20000 and a smaller one from the first
group, owing less than $5000. The risk of fraud or error is likely to be higher in the group
owing larger amounts. When we choose a sample proportional to the population we sample
every sub-group at the same rate: the first group represents 40% of the population, so it
supplies 40% of the sample. Since the last group involves larger debts, taking a relatively
larger sample from it would be preferred in this case.

f) What other useful ways can be used to stratify the accounts receivable accounts?
What rule is to be applied when choosing a subgroup or stratum in stratified sampling?
In stratified random sampling the population is separated into mutually exclusive (meaning
non-overlapping) sets or strata, and then simple random sampling is done from each
stratum.
Other useful stratification methods:
• Age of debts (time, in days, debtors take to pay their debts)
• Department (based on different branches)
• Types of debtors
g) A politician intends to collect sample data on voters' ages to estimate the population
mean age of voters in her electorate. Unfortunately, she does not have a complete list
of voters. State, with a reason, a sampling plan that would be suitable for her purposes.

Cluster sampling, because this sampling method does not require a sampling frame listing
every member of the population.
Note: with stratified sampling, the population is divided into groups and some elements
are selected from each of the sub-groups. With cluster sampling, the population is divided into
sub-groups and all the elements are selected from the selected sub-groups.
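The contrast in the note can be made concrete with a toy example. This is only a sketch: the electorate of 6 groups with 10 voters each, and the choice of 2 voters per group and 2 clusters, are invented for illustration and do not come from the tutorial.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical electorate: 6 groups (e.g. polling districts) of 10 voters each.
groups = {g: [f"G{g}-V{v}" for v in range(10)] for g in range(6)}

# Stratified sampling: SOME elements from EVERY group.
# This needs the list of members inside each group.
stratified = [
    voter
    for members in groups.values()
    for voter in rng.sample(members, 2)
]

# Cluster sampling: pick SOME groups at random, then take EVERY element in them.
# Choosing the clusters only needs the list of groups, not of individual voters.
chosen_groups = rng.sample(sorted(groups), 2)
cluster = [voter for g in chosen_groups for voter in groups[g]]

print(len(stratified), "voters drawn from", len(groups), "groups (stratified)")
print(len(cluster), "voters drawn from", len(chosen_groups), "groups (cluster)")
```

In the politician's case only the second approach is feasible, because choosing clusters requires a list of groups rather than a list of voters.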

h) Differentiate between sampling and non-sampling errors. Discuss ways to overcome
these errors.
Sampling error refers to differences between the sample and the population that arise because
of the specific observations that happen to be selected. Sampling error is expected to occur when
making a statement about the population based on the sample taken. Increasing the sample
size will reduce the sampling error. It can also be reduced by making the sample more
representative using stratified sampling.

Non-sampling errors occur due to mistakes made during the process of data acquisition or
sample observations being selected improperly. Increasing the sample size will not reduce this
type of error. There are three types of non-sampling errors:
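A rough way to see the effect of sample size is the standard error of a sample mean, sigma / sqrt(n). The sketch below assumes simple random sampling from a large population and ignores any finite population correction; the value of `sigma` is hypothetical and is not taken from the tutorial.

```python
import math


def standard_error(sigma, n):
    """Standard error of the sample mean under simple random sampling
    from a large population (no finite population correction)."""
    return sigma / math.sqrt(n)


sigma = 4000.0  # hypothetical standard deviation of account balances, in dollars

for n in (100, 400, 1600):
    print(f"n = {n:4d}: standard error = {standard_error(sigma, n):.1f}")
```

Under these assumptions, quadrupling the sample size halves the standard error, so extra accuracy becomes progressively more expensive to obtain.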

1) Errors in data acquisition:
These arise from the recording of incorrect responses, due to:
• incorrect measurements being taken because of faulty equipment,
• mistakes made during transcription from primary sources,
• inaccurate recording of data due to misinterpretation of terms, or
• inaccurate responses to questions concerning sensitive issues.
2) Non-response errors:
These refer to error (or bias) introduced when responses are not obtained from some members
of the sample, i.e. the sample observations that are collected may not be representative of the
target population.
As mentioned earlier, the response rate (i.e. the proportion of all people selected who
complete the survey) is a key survey parameter and helps in understanding the validity of
the survey and the sources of non-response error.
3) Selection bias: occurs when the sampling plan is such that some members of the target
population cannot possibly be selected for inclusion in the sample.
To reduce non-sampling errors it is recommended to develop the skills and competency of the
staff involved in the data collection process, and to use internal controls such as double checking.
i) Discuss the difference between an observational study and an experimental study.
An observational study is one in which measurements representing a variable of interest are
observed and recorded without controlling any factor that might influence their values,
e.g. measuring the height of a tree in the rainforest over time.
An experimental study is one in which measurements representing a variable of interest are
observed and recorded while controlling factors that might influence their values, e.g. measuring
the yield of different types of rice using a certain amount of fertilizer (the controlled factor).
