POLC 6314 - Homework 4 - DELAO

De La O, Katy (1)

POLC 6314 – Policy Research Methods I: Introduction to Statistics


Fall 2021
Homework 4
Due date: October 14, 2021. Please type your homework and upload to
Blackboard by 11.59pm on October 14, 2021. Make sure to show your work! You
don’t have to type pretty equations, just make sure you show how you got your
answers.

Multiple Choice Questions (16 points)


(1) What z-value is associated with a 95% confidence interval?
a. 1.28
b. 1.65
c. 1.96
d. 2.58

(2) Which of the following is NOT a property of the student’s t distribution?


a. It is symmetric.
b. Its exact shape (i.e., spread) is characterized by the degrees of
freedom.
c. As the sample size grows, it gradually approaches the normal
distribution.
d. All of the above are properties of the t distribution.

(3) A 95% confidence interval for the average can be interpreted to mean which
of the following?
a. If all possible samples are taken and confidence intervals calculated,
95% of those intervals would include the true population mean
somewhere in their interval.
b. You can be 95% confident that you have selected a sample whose
interval includes the population mean.
c. Both answers (a) and (b) are correct.
d. Neither answer (a) nor (b) is correct.

(4) If, for a sample in which the subjects are randomly chosen, the mean
income is $45,000, the sample size is 1600, the standard deviation is $4,000,
and the standard error of the mean is $100, what, approximately, is the 95%
confidence interval for the population mean?
a. $41,000 to $49,000.
b. $37,000 to $53,000.
c. $44,800 to $45,200.
d. It is impossible to know because we do not know if the incomes are
normally distributed.

(5) Suppose a reputable pollster reports that, on the basis of a random sample of
American adults, the President's approval rating is 50%, and that the margin
of error is plus or minus 4%. What is the correct interpretation of this
reported result?
a. Roughly half of the people who were invited to participate in the
survey agreed to do so.
b. The true level of approval in the population is equal to 50%, with only
a 4% chance that it is equal to something other than 50%.
c. We can be approximately 95% confident that the true level of
approval in the population as a whole is somewhere between 46% and
54%.
d. We have not been given enough information to draw any solid
interpretation of these facts.

(6) The process of statistical inference is defined as:


a. Trying to learn about a characteristic about a broader population based
on observations of only a sample from that population.
b. Validating that the observed sample is indeed representative of the
population as a whole.
c. Admitting that the researcher has not collected enough data to draw a
conclusion.
d. Collecting data from the entire population of cases.

(7) Which of the following accurately describes a p-value?


a. A p=.01 means that there is a 1% chance that we would see the
measured relationship due to random chance
b. For the same measured relationship, a larger sample size will lead to a
smaller p-value.
c. Both (a) and (b) are correct.
d. Neither (a) nor (b) are correct.

(8) A relationship between two variables is described as "statistically
significant" under which of the following circumstances?
a. When there is a sufficiently high p-value.


b. When there is a credible causal claim about the relationship between
the two variables.
c. Both (a) and (b) are correct.
d. Neither (a) nor (b) are correct.

Short Answer/Calculation Questions (64 points)


1. Every user of statistics should understand the distinction between statistical
significance and practical importance. A sufficiently large sample will declare very
small effects statistically significant. Let us suppose that SAT Mathematics
(SATM) scores in the absence of coaching vary normally with mean μ = 505 and
standard deviation σ = 100. Suppose further that coaching may change μ but does
not change σ . An increase in the SATM score from 505 to 508 is of no importance
in seeking admission to college, but this unimportant change can be statistically
very significant. To see this, calculate the p-value for the test of
H0: μ = 505
Ha: μ ≠ 505
in each of the following situations:
a) A coaching service coaches 100 students; their SATM average score is Ȳ = 508. (3
points)
SE = σ/√n = 100/√100 = 10
Z = (Observed − Expected)/SE = (508 − 505)/10 = 0.3

When Z = 0.3, the table value is 0.6179.

(1 − 0.6179) × 2 = 0.7642 (doubled because the test is two-sided)
We fail to reject H0 because 0.7642 > 0.05 (α).

b) By the next year, the service has coached 1000 students; their SATM average score
is Ȳ = 508. (3 points)
n = 1000; Ȳ = 508; μ = 505; σ = 100
SE = σ/√n = 100/√1000 = 3.1623
Z = (Observed − Expected)/SE = (508 − 505)/3.1623 = 0.9487

When Z = 0.9487 (rounded to 0.95), the table value is 0.8289.

(1 − 0.8289) × 2 = 0.3422 (doubled because the test is two-sided)
We fail to reject H0 because 0.3422 > 0.05 (α).
c) An advertising campaign brings the number of students coached to 10,000; their
SATM average score is still Ȳ = 508. (3 points)

n = 10000; Ȳ = 508; μ = 505; σ = 100


SE = σ/√n = 100/√10000 = 1
Z = (Observed − Expected)/SE = (508 − 505)/1 = 3

When Z = 3, the table value is 0.9987.

(1 − 0.9987) × 2 = 0.0026 (doubled because the test is two-sided)
We reject H0 because 0.0026 < 0.05 (α).
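
The three cases above differ only in n, so they can be checked with one small function. The following is a minimal sketch in Python, not part of the assignment; the function name two_sided_p is illustrative, and it assumes a known σ so that the normal distribution applies.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(sample_mean, mu0, sigma, n):
    """Return (z, p) for a two-sided one-sample z-test with known sigma."""
    se = sigma / sqrt(n)                      # standard error of the mean
    z = (sample_mean - mu0) / se              # (observed - expected) / SE
    p = 2 * (1 - NormalDist().cdf(abs(z)))    # double the upper-tail area
    return z, p

# Same sample mean (508) and null value (505), three sample sizes.
results = {n: two_sided_p(508, 505, 100, n) for n in (100, 1000, 10000)}
for n, (z, p) in results.items():
    print(f"n={n:>6}  z={z:.4f}  p={p:.4f}")
```

The p-values come out near 0.76, 0.34, and 0.003, so the same 3-point difference only becomes statistically significant at the largest sample size. Small differences from table-based answers are expected because the table rounds Z to two decimals.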

2. You want to rent an unfurnished one-bedroom apartment in Houston next year.
The mean monthly rent for a random sample of 10 apartments advertised in the
local newspaper is $1400. Assume that the standard deviation is $220. Find a 95%
confidence interval for the mean monthly rent for unfurnished one-bedroom
apartments available for rent in this community. (6 points)

n = 10; Ȳ = $1400; s = $220

Because the sample is small (n = 10), the critical value comes from the t
distribution rather than the z-value of 1.96.

df = 10 − 1 = 9
CV = 2.262

SE = s/√n = 220/√10 = 69.57

1400 ± (2.262 × 69.57) = 1400 ± 157.367

1400 + 157.367 = 1557.367
1400 − 157.367 = 1242.633
(1242.633, 1557.367)
We are 95% confident that the true population mean rent for an
unfurnished one-bedroom apartment falls between $1242.63 and
$1557.37.
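
As a check on the arithmetic, here is a minimal Python sketch of the same interval. The critical value 2.262 is hard-coded from the t-table (df = 9) as in the answer above, since the Python standard library has no t distribution; the variable names are illustrative.

```python
from math import sqrt

n = 10
sample_mean = 1400.0
s = 220.0
t_crit = 2.262  # t-table value for df = 9 at 95% confidence (taken from the answer above)

se = s / sqrt(n)            # standard error of the mean
margin = t_crit * se        # margin of error
ci = (sample_mean - margin, sample_mean + margin)
print(f"SE={se:.2f}  margin={margin:.2f}  CI=({ci[0]:.2f}, {ci[1]:.2f})")
```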

3. State the appropriate null hypothesis H0 and alternative hypothesis Ha in each of
the following cases. For practice, use both one-sided (> or <) and two-sided (≠).

a) A 2002 study reported that 70% of students owned a cell phone. You plan to take
a simple random sample of students to see if the percent has increased. (3 points)

(p is the population proportion of students who own a cell phone.)

H0: p = 0.70
Ha: p ≠ 0.70

Or

H0: p = 0.70
Ha: p > 0.70
b) A university gives credit in French language courses to students who pass a
placement test. The language department wants to know if students who get credit
in this way differ in their understanding of spoken French from students who
actually take the French courses. Experience has shown that the mean score of
students in the courses on a standard listening test is 26. The language department
gives the same listening test to a sample of 35 students who passed the credit
examination to see if their performance is different. (3 points)
μ = 26; n = 35
H0: μ = 26
Ha: μ ≠ 26

c) An education researcher randomly divides sixth-grade students into two groups for
physical education class. He teaches both groups basketball skills, using the same
methods of instruction in both classes. He encourages Group A with compliments
and other positive behavior but acts cool and neutral toward Group B. He hopes to
show that positive teacher attitudes result in a higher mean score on a test of
basketball skills than do neutral attitudes. (3 points)
2 Groups:
Group A: Positive
Group B: Neutral
H0: μa = μb
Ha: μa ≠ μb

Or

H0: μa = μb
Ha: μa > μb

4. According to a union agreement, the mean income for all senior-level assembly-
line workers in a large company equals $500 per week. A representative of a
women's group decides to analyze whether the mean income μ for female
employees matches this norm. For a random sample of nine employees, Ȳ = 410 and
s = 90.
a. Test whether the mean income of female employees differed from $500 per week.
(6 points)
μ = $500; n = 9; Ȳ = 410; s = 90

H0: μ = 500
Ha: μ ≠ 500
SE = s/√n = 90/√9 = 30
t = (Observed − Expected)/SE = (410 − 500)/30 = −3

Because the population standard deviation is unknown and the sample is small,
the test statistic is compared with the t distribution. (On the normal table,
the tail area beyond Z = 3 would be 0.0013.)

Degrees of freedom = n − 1 = 9 − 1 = 8

Critical value: 2.306

Absolute value of −3 = 3

3 > 2.306, so we must reject the null hypothesis, H0.

This means that the sample mean of $410 differs significantly from the mean
income of $500. The sample mean is below $500, which suggests that female
employees earn less, on average, than the $500 per week set by the union
agreement.
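
The test statistic and decision above can be reproduced with a few lines of Python. This is only a sketch: the critical value 2.306 (df = 8, two-sided α = 0.05) is hard-coded from the t-table as in the answer, and the variable names are illustrative.

```python
from math import sqrt

n, sample_mean, s, mu0 = 9, 410.0, 90.0, 500.0
t_crit = 2.306  # t-table value for df = 8, two-sided alpha = 0.05 (from the answer above)

se = s / sqrt(n)                     # 90 / 3
t_stat = (sample_mean - mu0) / se    # (observed - expected) / SE
reject_h0 = abs(t_stat) > t_crit     # two-sided decision rule
print(f"SE={se:.1f}  t={t_stat:.2f}  reject H0: {reject_h0}")
```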

5. A study based on a sample of size 25 reported a mean of 93 with a margin of error
of 11 for 95% confidence.
Ȳ = 93; n = 25; M.E. = 11 (95% confidence, z = 1.96)

a. Give the 95% confidence interval. (4 points)

Ȳ ± M.E. = 93 ± 11
93 + 11 = 104
93 − 11 = 82
(82, 104)

b. If you wanted 99% confidence for the same study, would your margin of error be
greater than, equal to, or less than 11? Explain your answer. (4 points)

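The critical value for 99% confidence is 2.58, compared with 1.96 for 95%
confidence. The margin of error is the critical value times the standard error,
and the standard error does not change, so the margin of error will be larger
than 11.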
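
To see roughly how much larger, the standard error can be backed out of the reported margin and rescaled. The sketch below is an illustration in Python, assuming normal (z) critical values as in the answer above; with n = 25 and an estimated standard deviation, t critical values would give somewhat different numbers but the same direction.

```python
from statistics import NormalDist

margin_95 = 11.0
z95 = NormalDist().inv_cdf(0.975)   # about 1.96
z99 = NormalDist().inv_cdf(0.995)   # about 2.58

se = margin_95 / z95                # implied standard error
margin_99 = z99 * se                # margin of error at 99% confidence
print(f"implied SE={se:.3f}  99% margin={margin_99:.2f}")
```

Under these assumptions the 99% margin is about 14.5, which is wider than 11.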
6. An exit poll of 2293 votes in the 2006 Ohio Senatorial election indicated that 44%
voted for the Republican candidate, Mike DeWine, and 56% voted for the
Democratic candidate, Sherrod Brown.
a. If actually 50% of the population voted for DeWine, find the standard
error of the sampling proportion voting for him for this exit poll. [hint:
you’ve been given the population proportion which is .50, which you
should use to calculate the population standard deviation and, in turn,
the standard error for the exit poll] (4 points)
44% voted Republican
56% voted Democratic
n = 2293

If 50% voted for DeWine (p is the population proportion voting for him):

H0: p = 0.50
Ha: p ≠ 0.50

Population SD = √(p(1 − p)) = √(0.50(1 − 0.50)) = 0.5

SE = √(p(1 − p))/√n = √0.25/√2293 = 0.5/47.885

SE = 0.01044

b. If actually 50% of the population voted for DeWine, would it have


been surprising to obtain the results in this exit poll? Why? Show
why/why not and explain. (6 points)

H0: p = 0.50
Ha: p ≠ 0.50

From above: SE = 0.01044

Z = (p-hat − p0)/SE0 = (0.44 − 0.50)/0.01044 = −0.06/0.01044
Z = −5.747
CV: 1.96

Absolute value: |−5.747| = 5.747

5.747 > 1.96

Thus, we reject the null hypothesis, H0, because the
absolute value of the test statistic (5.747) is greater than the
critical value of 1.96. This means that if DeWine actually
had 50% of the vote, getting 44% in an exit poll of this size
would be highly unlikely, so the result would be surprising.
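
The standard error and test statistic for this exit poll can be checked with a short Python sketch. It follows the hint in the question by using the null proportion 0.50 for the standard error; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p_hat, p0 = 2293, 0.44, 0.50

se0 = sqrt(p0 * (1 - p0) / n)             # standard error under H0: p = 0.50
z = (p_hat - p0) / se0                    # (observed - expected) / SE
p_value = 2 * NormalDist().cdf(-abs(z))   # two-sided p-value
print(f"SE={se0:.5f}  z={z:.3f}  p={p_value:.2e}")
```

The unrounded standard error gives a z of about −5.75, matching the hand calculation up to rounding, and the two-sided p-value is far below 0.05.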

7. A poll in Canada indicated that 48% of Canadians favor imposing the death
penalty (Canada does not have it). A news article cited this statistic but did not
report the sample size, but stated, “Polls of this size are considered accurate within
2.5 percentage points 95% of the time.” About how large was the sample size?
[Hint: remember how margin of error is calculated] (6 points)
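48% favor the death penalty
Margin of error = 2.5 percentage points (0.025) at 95% confidence (CV = 1.96)

Margin of error = CV × SE
SE = √(p-hat(1 − p-hat))/√n = √(0.48 × 0.52)/√n = 0.4996/√n

0.025 = 1.96 × 0.4996/√n
√n = (1.96 × 0.4996)/0.025 = 39.1686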
n=1534.18
Approximately 1534 for the sample size
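
Solving the margin-of-error formula for n can be sketched in Python as follows. This is an illustration only; it assumes the poll used the sample proportion 0.48 and the 95% critical value 1.96, as in the answer above.

```python
from math import sqrt

p_hat = 0.48
margin = 0.025   # 2.5 percentage points
z = 1.96         # 95% critical value

# margin = z * sqrt(p_hat * (1 - p_hat)) / sqrt(n)  =>  solve for n
n = (z * sqrt(p_hat * (1 - p_hat)) / margin) ** 2
print(f"n = {n:.2f}")
```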
8. By law, an industrial plant can discharge no more than 500 gallons of waste water
per hour, on the average, into a neighboring lake. Based on other infractions they
have noticed, an environmental group believes this limit is being exceeded.
Monitoring the plant is expensive, and a random sample of four hours is selected
over a period of a week. They find the following:

Variable      No. of Cases   Mean      SD
Waste_Water   4              1000.00   400.00

By law, no more than 500 gallons of waste water per hour, on average (over the
four sampled hours, the legal maximum would total 4 × 500 = 2,000 gallons).

n = 4
Ȳ = 1000
s = 400

a. Find the standard error. (2 points)


SE = s/√n = 400/√4 = 200
b. Test whether the mean discharge equals 500 gallons per hour against the
alternative that it does not.

i. What are the null and alternative hypotheses? (2 points)

H0: μ = 500
Ha: μ ≠ 500
ii. Calculate the test statistic (3 points)
t = (Observed − Expected)/SE = (1000 − 500)/200 = 2.5

df = n − 1 = 4 − 1 = 3

t-table: 3.182
CV = 3.182, t-statistic = 2.5

The sample mean falls 2.5 standard errors above the H0 mean.


iii. What is your conclusion about whether you reject or not reject the null?
Explain. (3 points)

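CV = 3.182, t-statistic = 2.5

The sample mean falls 2.5 standard errors above the H0 mean. Because
|t| = 2.5 is less than the critical value of 3.182 (df = 3, two-sided
α = 0.05), we fail to reject the null hypothesis. Although the sample mean of
1000 is double the legal limit, a sample of only four hours does not give
statistically significant evidence at the 5% level that the mean discharge
differs from 500 gallons per hour.

**Note for professor** A result more than 2 standard errors from the H0 mean
would lead to rejection under the normal distribution, but with only 3 degrees
of freedom the t critical value is 3.182, so t-stat 2.5 < CV 3.182 and the
decision is fail to reject.**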
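
The tension between the "2 standard errors" rule of thumb and the t-table cutoff can be made concrete with a short Python sketch. It is illustrative only: the normal cutoff is computed, while the t critical value 3.182 (df = 3) is hard-coded from the t-table as in the answer above.

```python
from math import sqrt
from statistics import NormalDist

n, sample_mean, s, mu0 = 4, 1000.0, 400.0, 500.0

se = s / sqrt(n)
t_stat = (sample_mean - mu0) / se

z_crit = NormalDist().inv_cdf(0.975)   # about 1.96, large-sample cutoff
t_crit = 3.182                         # t-table value for df = 3 (from the answer above)

reject_under_z = abs(t_stat) > z_crit  # what the rule of thumb would say
reject_under_t = abs(t_stat) > t_crit  # the appropriate small-sample decision
print(f"t={t_stat:.2f}  z rule: {reject_under_z}  t rule: {reject_under_t}")
```

With only four observations the t cutoff is much wider than 1.96, which is why a statistic of 2.5 is not enough to reject here.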

Stata Questions (20 points)


Use the Stata data file that you used for Homework 3
(“anes_panel_2013_inetrecontact_clean.dta”) to complete the following tasks:
1. Use the commands you used (or get mine from answer key) to generate
again the variables interest and efficacy. (2 points)
gen interest=.
replace interest=1 if C5_A1==5


replace interest=2 if C5_A1==4
replace interest=3 if C5_A1==3
replace interest=4 if C5_A1==2
replace interest=5 if C5_A1==1
lab define interest 1 "Not interested at all" 2 "Slightly interested" ///
    3 "Moderately interested" 4 "Very interested" 5 "Extremely interested"
lab val interest interest
tabulate interest

gen C5_F1_new=.
replace C5_F1_new=1 if C5_F1==5
replace C5_F1_new=2 if C5_F1==4
replace C5_F1_new=3 if C5_F1==3
replace C5_F1_new=4 if C5_F1==2
replace C5_F1_new=5 if C5_F1==1
tabulate C5_F1_new C5_F1

gen C5_F2_new=.
replace C5_F2_new=1 if C5_F2==5
replace C5_F2_new=2 if C5_F2==4
replace C5_F2_new=3 if C5_F2==3
replace C5_F2_new=4 if C5_F2==2
replace C5_F2_new=5 if C5_F2==1
tabulate C5_F2_new C5_F2

gen efficacy = C5_F1_new + C5_F2_new


tabulate efficacy

2. Create new variable interest_binary which recodes interest into a


binary/dichotomous variable. The new variable interest_binary will take 0
for those "Not interested at all", "Slightly interested", and "Moderately
interested” and 1 for the other two categories of interest (the old variable).
Label variable 0 “Not interested” and 1 “Interested.” Tabulate new variable.
Copy and paste code and output. (4 points)

//just to check//
codebook interest
codebook efficacy
sum interest
sum efficacy

// 2. Create new variable interest_binary, which recodes interest into a
// binary/dichotomous variable. The new variable interest_binary will take 0 for those
// "Not interested at all", "Slightly interested", and "Moderately interested" and 1 for
// the other two categories of interest (the old variable).

generate interest_binary=.
codebook interest_binary

replace interest_binary=0 if interest==1 | interest==2 | interest==3


replace interest_binary=1 if interest==4 | interest==5

lab define interest_binary 0 "Not interested" 1 "Interested"


lab val interest_binary interest_binary

tab interest_binary
3. Using the new variable (interest_binary) you created in #2, test whether a
majority can be considered interested in politics. Copy and paste stata code
and output.

// 3. Using the new variable (interest_binary) you created in #2, test whether a
// majority can be considered interested in politics.

prtest interest_binary == 0.5

a. What are the null and alternative hypotheses for this test of
significance? (1 point)

H0: p = 0.50
Ha: p ≠ 0.50

b. What is the test statistic value? (1 point) -7.5801


c. What is the p-value for this test? (1 point) 0.0000 as displayed, meaning
the value is so small that it rounds to zero at four decimal places.
d. Interpret results. Do you reject or fail to reject null, and what does this
mean substantively (i.e., do you have statistically significant evidence
that majority of Americans are interested in politics or not)? (4 points)
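• We reject the null hypothesis. The rule is that we
reject H0 if the p-value < alpha level. Here the p-value
(displayed as 0.0000) is far below α = 0.05. However, the
test statistic is negative (−7.5801), which means the
sample proportion coded "Interested" is below 0.50. So we
have statistically significant evidence that the share of
Americans interested in politics differs from 50%, and the
direction indicates that fewer than half, not a majority,
are interested in politics.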
4. Test whether the mean of efficacy is 5.
// 4. Test whether the mean of efficacy is 5
ttest efficacy == 5 // or
ttest efficacy == 5, level(95)

a. What are the null and alternative hypotheses for this test of
significance? (1 point)

H0: μ = 5
Ha: μ ≠ 5

b. What is the test statistic value? (1 point)


-16.9499
c. What is the p-value for this test? (1 point)
0.000… so it is very small as well.
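d. Interpret results. Do you reject or fail to reject null, and what does this
mean substantively? (4 points)
• We would reject the null hypothesis,
H0: μ = 5
This means we have statistically significant evidence that the mean of
efficacy is not equal to 5. Because the test statistic is negative
(−16.9499), the sample mean is below 5. The evidence supports the
alternative hypothesis.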
